WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 





PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 : 

C12N 15/29, 5/04, 15/82, A01H 4/00, 5/00 



A1



(11) International Publication Number: WO 99/31243 

(43) International Publication Date: 24 June 1999 (24.06.99) 



(21) International Application Number: PCT/US98/26784 

(22) International Filing Date: 16 December 1998 (16.12.98) 



(30) Priority Data: 

08/991,677 



16 December 1997 (16.12.97) US



(71) Applicant (for all designated States except US): INTERNATIONAL
PAPER COMPANY [US/US]; 2 Manhattanville
Road, Purchase, NY 10577-2196 (US).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): CHIANG, Vincent, L.
[US/US]; 1104 Birch Street, Hancock, MI 49930 (US).
CARRAWAY, Daniel, T. [US/US]; 1910 Rich Street,
Bainbridge, GA 31717 (US). SMELTZER, Richard, H.
[US/US]; 5036 Valley Farm Road, Tallahassee, FL 32303
(US).

(74) Agents: GRAHAM, Mark, S.; Luedeka, Neely & Graham, P.C.,
P.O. Box 1871, Knoxville, TN 37901 (US) et al.



(81) Designated States: BR, CA, CN, FI, ID, MX, NO, NZ, PL,
RU, SE, US, European patent (AT, BE, CH, CY, DE, DK,
ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE).



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: PRODUCTION OF SYRINGYL LIGNIN IN GYMNOSPERMS 



(57) Abstract 



The present invention relates to a method for producing syringyl lignin in gymnosperms. The production of syringyl lignin in 
gymnosperms is accomplished by genetically transforming a gymnosperm genome, which does not normally contain genes which code for 
enzymes necessary for production of syringyl lignin, with DNA which codes for enzymes found in angiosperms associated with production 
of syringyl lignin. The expression of the inserted DNA is mediated using host promoter regions in the gymnosperm. In addition, genetic 
sequences which code for gymnosperm lignin anti-sense mRNA may be incorporated into the gymnosperm genome in order to suppress 
the formation of the less preferred forms of lignin in the gymnosperm such as guaiacyl lignin. 



WO 99/31243 



PCT/US98/26784 



PRODUCTION OF SYRINGYL LIGNIN IN GYMNOSPERMS

Field of the Invention 

This application claims the benefit of U.S. Provisional Application number 60/033,381, filed
December 16, 1996. The invention relates to the molecular modification of gymnosperms in order to
cause the production of syringyl units during lignin biosynthesis and to production and propagation of
gymnosperms containing syringyl lignin.

Background of the Invention 

Lignin is a major part of the supportive structure of most woody plants including angiosperm
and gymnosperm trees which in turn are the principal sources of fiber for making paper and cellulosic
products. In order to liberate fibers from wood structure in a manner suitable for making many grades
of paper, it is necessary to remove much of the lignin from the fiber/lignin network. Lignin is removed
from wood chips by treatment of the chips in an alkaline solution at elevated temperatures and pressure
in an initial step of papermaking processes. The rate of removal of lignin from wood of different tree
species varies depending upon lignin structure. Three different lignin structures have been identified in
trees: p-hydroxyphenyl, guaiacyl and syringyl, which are illustrated in Fig. 1.

Angiosperm species, such as Liquidambar styraciflua L. [sweetgum], have lignin composed of a
mixture of guaiacyl and syringyl monomer units. In contrast, gymnosperm species such as Pinus taeda
L. [loblolly pine] have lignin which is devoid of syringyl monomer units. Generally speaking, the rate
of delignification in a pulping process is directly proportional to the amount of syringyl lignin present in
the wood. The higher delignification rates associated with species having a greater proportion of
syringyl lignin result in more efficient pulp mill operations since the mills make better use of energy
and capital investment and the environmental impact is lessened due to a decrease in chemicals used for
delignification.

It is therefore an object of the invention to provide gymnosperm species which are easier to
delignify in pulping processes.

Another object of the invention is to provide gymnosperm species such as loblolly pine which 
contain syringyl lignin. 

An additional object of the invention is to provide a method for modifying genes involved in
lignin biosynthesis in gymnosperm species so that production of syringyl lignin is increased while
production of guaiacyl lignin is suppressed.

Still another object of the invention is to produce whole gymnosperm plants containing genes 
which increase production of syringyl lignin and repress production of guaiacyl lignin. 

Yet another object of the invention is to identify, isolate and/or clone those genes in
angiosperms responsible for production of syringyl lignin.

A further object of the invention is to provide, in gymnosperms, genes which produce syringyl
lignin.

Another object of the invention is to provide a method for making an expression cassette
insertable into a gymnosperm cell for the purpose of inducing formation of syringyl lignin in a
gymnosperm plant derived from the cell.

Definitions

The term "promoter" refers to a DNA sequence in the 5' flanking region of a given gene which
is involved in recognition and binding of RNA polymerase and other transcriptional proteins and is
required to initiate DNA transcription in cells.

The term "constitutive promoter" refers to a promoter which activates transcription of a desired
gene, and is commonly used in creation of an expression cassette designed for preliminary experiments
relative to testing of gene function. An example of a constitutive promoter is 35S CaMV, available
from Clontech.

The term "expression cassette" refers to a double stranded DNA sequence which contains both
promoters and genes such that expression of a given gene is achieved upon insertion of the expression
cassette into a plant cell.

The term "plant" includes whole plants and portions of plants, including plant organs (e.g.,
roots, stems, leaves, etc.).

The term "angiosperm" refers to plants which produce seeds encased in an ovary. A specific
example of an angiosperm is Liquidambar styraciflua (L.) [sweetgum]. The angiosperm sweetgum
produces syringyl lignin.

The term "gymnosperm" refers to plants which produce naked seeds, that is, seeds which are
not encased in an ovary. A specific example of a gymnosperm is Pinus taeda (L.) [loblolly pine]. The
gymnosperm loblolly pine does not produce syringyl lignin.

Summary of the Invention

With regard to the above and other objects, the invention provides a method for inducing
production of syringyl lignin in gymnosperms and provides gymnosperms which contain syringyl lignin for
improved delignification in the production of pulp for papermaking and other applications. In
accordance with one of its aspects, the invention involves cloning an angiosperm DNA sequence which
codes for enzymes involved in production of syringyl lignin monomer units, fusing the angiosperm
DNA sequence to a lignin promoter region to form an expression cassette, and inserting the expression
cassette into a gymnosperm genome.

Enzymes required for production of syringyl lignin in an angiosperm are obtained by deducing
an amino acid sequence of the enzyme, extrapolating an mRNA sequence from the amino acid
sequence, constructing a probe for the corresponding DNA sequence and cloning the DNA sequence
which codes for the desired enzyme. A promoter region specific to a gymnosperm lignin biosynthesis
gene is identified by constructing a probe for a gymnosperm lignin biosynthesis gene, sequencing the 5'
flanking region of the DNA which encodes the gymnosperm lignin biosynthesis gene to locate a
promoter sequence, and then cloning that sequence.
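
Extrapolating a nucleic acid sequence from an amino acid sequence is complicated by codon degeneracy, since most amino acids are encoded by more than one codon. The following is a minimal sketch of how the least degenerate stretch of a peptide might be chosen as the basis for a probe. The peptide and the function names are hypothetical and are not taken from this specification; the codon counts are those of the standard genetic code.

```python
from math import prod

# Number of codons that encode each amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that could encode `peptide`."""
    return prod(CODON_COUNT[aa] for aa in peptide)


def least_degenerate_window(peptide, width):
    """Return (start, window, degeneracy) for the window of `width` residues
    that can be encoded by the fewest nucleotide sequences."""
    if width <= 0 or width > len(peptide):
        raise ValueError("width must be between 1 and len(peptide)")
    starts = range(len(peptide) - width + 1)
    best = min(starts, key=lambda i: degeneracy(peptide[i:i + width]))
    window = peptide[best:best + width]
    return best, window, degeneracy(window)


# Hypothetical peptide fragment, for illustration only.
print(least_degenerate_window("LSRMWKDL", 4))  # (3, 'MWKD', 4)
```

Windows rich in methionine and tryptophan score lowest because each has a single codon, which is why such stretches are commonly favored when a probe must be back-translated from protein sequence.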

An expression cassette is constructed by fusing the angiosperm syringyl lignin DNA sequence 
to the gymnosperm promoter DNA sequence. Alternatively, the angiosperm syringyl lignin DNA is 
fused to a constitutive promoter to form an expression cassette. The expression cassette is inserted into 
the gymnosperm genome to transform the gymnosperm genome. Cells containing the transformed 
genome are selected and used to produce a transformed gymnosperm plant containing syringyl lignin. 

In accordance with the invention, the angiosperm gene sequences bi-OMT, 4CL, P450-1 and 
P450-2 have been determined and isolated as associated with production of syringyl lignin in sweetgum 
and lignin promoter regions for the gymnosperm loblolly pine have been determined to be the 5' 
flanking regions for the 4CL1B, 4CL3B and PAL gymnosperm lignin genes. Expression cassettes 
containing sequences of selected genes from sweetgum have been inserted into loblolly pine 
embryogenic cells and presence of sweetgum genes associated with production of syringyl lignin has 
been confirmed in daughter cells of the resulting loblolly pine embryogenic cells. 

The invention therefore enables production of gymnosperms such as loblolly pine containing 
genes which code for production of syringyl lignin, to thereby produce in such species syringyl lignin in 
the wood structure for enhanced pulpability. 

Brief Description of the Drawings

The above and other aspects of the invention will now be further described in the following 
detailed specification considered in conjunction with the following drawings in which: 

Fig. 1 illustrates a generalized pathway for lignin synthesis;

Fig. 2 illustrates a bifunctional-O-methyl transferase (bi-OMT) gene sequence involved in the
production of syringyl lignin in an angiosperm (SEQ ID 5 and 6);

Fig. 3 illustrates a 4-coumarate CoA ligase (4CL) gene sequence involved in the production of
syringyl lignin in an angiosperm (SEQ ID 7 and 8);

Fig. 4 illustrates a P450-1 gene sequence involved in the production of syringyl lignin in an
angiosperm (SEQ ID 1 and 2);

Fig. 5 illustrates a P450-2 gene sequence involved in the production of syringyl lignin in an 
angiosperm (SEQ ID 3 and 4); 

Fig. 6 illustrates nucleotide sequences of the 5' flanking region of the loblolly pine 4CL1B gene 
showing the location of regulatory elements for lignin biosynthesis (SEQ ID 10); 

Fig. 7 illustrates nucleotide sequences of the 5' flanking region of the loblolly pine 4CL3B gene 
showing the location of regulatory elements for lignin biosynthesis (SEQ ID 11); 

Fig. 8 illustrates nucleotide sequences of the 5' flanking region of the loblolly pine PAL gene
showing the location of regulatory elements for lignin biosynthesis (SEQ ID 9);

Fig. 9 illustrates a PCR confirmation of the sweetgum P450-1 gene sequence in transgenic 
loblolly pine cells. 

Detailed Description of the Invention 
In accordance with the invention, a method is provided for modifying a gymnosperm genome,
such as the genome of a loblolly pine, so that syringyl lignin will be produced in the resulting plant,
thereby enabling cellulosic fibers of the same to be more easily separated from lignin in a pulping
process. In general, this is accomplished by fusing one or more angiosperm DNA sequences (referred
to at times herein as the "ASL DNA sequences") which are involved in production of syringyl lignin to
a gymnosperm lignin promoter region (referred to at times herein as the "GL promoter region") specific
to genes involved in gymnosperm lignin biosynthesis to form a gymnosperm syringyl lignin expression
cassette (referred to at times herein as the "GSL expression cassette"). Alternatively, the one or more
ASL DNA sequences are fused to one or more constitutive promoters to form a GSL expression
cassette.

The GSL expression cassette preferably also includes selectable marker genes which enable
transformed cells to be differentiated from untransformed cells. The GSL expression cassette
containing selectable marker genes is inserted into the gymnosperm genome and transformed cells are
identified and selected, from which whole gymnosperm plants may be produced which exhibit
production of syringyl lignin.

To suppress production of less preferred forms of lignin in gymnosperms, such as guaiacyl
lignin, genes from the gymnosperm associated with production of these less preferred forms of lignin
are identified and isolated, and the DNA sequence coding for anti-sense mRNA (referred to at times herein
as the "GL anti-sense sequence") for these genes is produced. The DNA sequence coding for anti-sense
mRNA is then incorporated into the gymnosperm genome. When expressed, the anti-sense mRNA binds to
the less preferred guaiacyl gymnosperm lignin mRNA, inactivating it.

Further features of these and various other steps and procedures associated with practice of the 
invention will now be described in more detail beginning with identification and isolation of ASL DNA 
sequences of interest for use in inducing production of syringyl lignin in a gymnosperm. 

I. Determination of DNA Sequence For Genes Associated with Production of Syringyl Lignin

The general biosynthetic pathway for production of lignin has been postulated as shown in Fig.
1. From Fig. 1, it can be seen that the genes CCL, OMT and F5H (which is from the class of P450
genes) may play key roles in production of syringyl lignin in some plant species, but their specific
contributions and mechanisms remain to be positively established. It is suspected that the CCL, OMT
and F5H genes may have specific equivalents in a specific angiosperm, such as sweetgum.
Accordingly, one aim of the present invention is to identify, sequence and clone specific genes of
interest from an angiosperm such as sweetgum which are involved in production of syringyl lignin and
to then introduce those genes into the genome of a gymnosperm, such as loblolly pine, to induce
production of syringyl lignin.

Genes of interest may be identified in various ways, depending on how much information about
the gene is already known. Genes believed to be associated with production of syringyl lignin have
already been sequenced from a few angiosperm species, viz., CCL and OMT.

DNA sequences of the various CCL and OMT genes are compared to each other to determine if 
there are conserved regions. Once the conserved regions of the DNA sequences are identified, primers 
homologous to the conserved sequences are synthesized. Reverse transcription of the DNA-free total 
RNA which was purified from sweetgum xylem tissue, followed by double PCR using gene-specific 
primers, enables production of probes for the CCL and OMT genes. 
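
The comparison of known sequences to locate conserved regions can be sketched as follows. This is an illustrative assumption about how such a comparison might be automated, not the procedure used by the inventors: the sequences are hypothetical, are assumed to be already aligned to equal length, and a "conserved region" is taken to mean a run of columns that is identical in every sequence.

```python
def conserved_windows(aligned, min_len):
    """Return (start, end) half-open column ranges, at least `min_len` long,
    in which every aligned sequence carries the same base."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    windows = []
    start = None
    for i in range(length):
        column = {seq[i] for seq in aligned}
        identical = len(column) == 1 and "-" not in column  # gaps never count
        if identical and start is None:
            start = i
        elif not identical and start is not None:
            if i - start >= min_len:
                windows.append((start, i))
            start = None
    if start is not None and length - start >= min_len:
        windows.append((start, length))
    return windows


# Hypothetical aligned fragments from three species, for illustration only.
fragments = [
    "ATGGCTTACGGA",
    "ATGGCATACGGT",
    "ATGGCGTACGGC",
]
print(conserved_windows(fragments, 5))  # [(0, 5), (6, 11)]
```

Primers homologous to the reported windows would then be candidates for amplifying the corresponding gene from a new species.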

A sweetgum cDNA library is constructed in a host, such as Lambda ZAPII, available from
Stratagene of La Jolla, CA, using poly(A)+ RNA isolated from sweetgum xylem, according to the
methods described by Bugos et al. (1995 Biotechniques 19:734-737). The above mentioned probes are
used to assay the sweetgum cDNA library to locate cDNA which codes for enzymes involved in
production of syringyl lignin. Once a syringyl lignin sequence is located, it is then cloned and
sequenced according to known methods which are familiar to those of ordinary skill.

In accordance with the invention, two sweetgum syringyl lignin genes have been determined
using the above-described technique. These genes have been designated 4CL and bi-OMT. The
sequence obtained for the sweetgum syringyl lignin gene, designated bi-OMT, is illustrated in Fig. 2
(SEQ ID 5 and 6). The sequence obtained for the sweetgum syringyl lignin gene, designated 4CL, is
illustrated in Fig. 3 (SEQ ID 7 and 8).

An alternative procedure was employed to identify the F5H equivalent genes in sweetgum. 
Because the DNA sequences for similar P450 genes from other plant species were known, probes for 
the P450 genes were designed based on the conserved regions found by comparing the known 
sequences for similar P450 genes. The known P450 sequences used for comparison include all plant 
P450 genes in the GenBank database. Primers were designed based on two highly conserved regions 
which are common to all known plant P450 genes. The primers were then used in a PCR reaction with 
the sweetgum cDNA library as a template. Once P450-like fragments were located, they were 
amplified using standard PCR techniques, cloned into a pBluescript vector available from Stratagene of 
La Jolla, CA and transformed into a DH5α E. coli strain available from Gibco BRL of Gaithersburg,
MD. 

After E. coli colonies were tested in order to determine that they contained the P450-like DNA
fragments, the fragments were sequenced. Several P450-like sequences were located in sweetgum
using the above described technique. One P450-like sequence was sufficiently different from other
known P450 sequences to indicate that it represented a new P450 gene family. This potentially new
P450 cDNA fragment was used as a probe to screen two full length clones from the sweetgum xylem
cDNA library. These putative hydroxylase clones were designated P450-1 and P450-2. The sequences
obtained for P450-1 and P450-2 are illustrated in Fig. 4 (SEQ ID 1 and 2) and Fig. 5 (SEQ ID 3 and
4).

II. Identification of GL Gene Promoter Regions 

In order to locate gymnosperm lignin promoter regions, probes are developed to locate lignin
genes. After the gymnosperm lignin gene is located, the portion of DNA upstream from the gene is
sequenced, preferably using the GenomeWalker Kit, available from Clontech. The portion of DNA
upstream from the lignin gene will generally contain the gymnosperm lignin promoter region.

Gymnosperm genes of interest include CCL-like genes and PAL-like genes, which are believed
to be involved in the production of lignin in gymnosperms. Preferred probe sequences are developed
based on previously sequenced genes, which are available from GenBank. The preferred GenBank
accession numbers for the CCL-like genes include U39404 and U39405. A preferred GenBank
accession number for a PAL-like gene is U39792. Probes for such genes are constructed according to
methods familiar to those of ordinary skill in the art. A genomic DNA library is constructed and DNA
fragments which code for gymnosperm lignin genes are then identified using the above mentioned
probes. A preferred DNA library is obtained from the gymnosperm, Pinus taeda (L.) [loblolly pine],
and a preferred host of the genomic library is Lambda DASH II, available from Stratagene of La Jolla,
CA.

Once the DNA fragments which code for the gymnosperm lignin genes are located, the
genomic region upstream from the gymnosperm lignin gene (the 5' flanking region) is identified. This
region contains the GL promoter. Three promoter regions were located from gymnosperm lignin
biosynthesis genes. The first is the 5' flanking region of the loblolly pine 4CL1B gene, shown in Fig. 6
(SEQ ID 10). The second is the 5' flanking region of the loblolly pine gene 4CL3B, shown in Fig. 7
(SEQ ID 11). The third is the 5' flanking region of the loblolly pine gene PAL, shown in Fig. 8 (SEQ
ID 9).
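
Once a gene has been located within a sequenced genomic fragment, taking its 5' flanking region amounts to slicing the bases immediately upstream of the gene start. A minimal sketch follows, assuming the gene lies on the strand as written; the genomic fragment, the flank length and the motif searched for are hypothetical and are not the loblolly pine sequences of SEQ ID 9 to 11.

```python
def five_prime_flank(genomic, gene_start, length):
    """Return up to `length` bases immediately upstream of `gene_start`,
    the 0-based index of the first base of the gene on this strand."""
    if not 0 <= gene_start <= len(genomic):
        raise ValueError("gene_start lies outside the genomic fragment")
    return genomic[max(0, gene_start - length):gene_start]


def find_motif(seq, motif):
    """Return every 0-based position at which `motif` occurs in `seq`."""
    positions = []
    i = seq.find(motif)
    while i != -1:
        positions.append(i)
        i = seq.find(motif, i + 1)
    return positions


# Hypothetical genomic fragment; the gene is taken to begin at the first ATG.
genomic = "TTTTATAAAAGGCATGGCC"
gene_start = genomic.index("ATG")
flank = five_prime_flank(genomic, gene_start, 12)
print(flank)                        # TTTATAAAAGGC
print(find_motif(flank, "TATAAA"))  # [2]
```

Candidate regulatory elements can then be reported by their position within the flank, in the manner of Figs. 6 to 8.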

III. Fusing the GL Promoter Region to the ASL DNA Sequence 

The next step of the process is to fuse the GL promoter region to the ASL DNA sequence to
make a GSL expression cassette for insertion into the genome of a gymnosperm. This may be
accomplished by standard techniques. In a preferred method, the GL promoter region is first cloned
into a suitable vector. Preferred vectors are pGEM7Z, available from Promega, Madison, WI and SK
available from Stratagene of La Jolla, CA. After the promoter sequence is cloned into the vector, it is
then released with suitable restriction enzymes. The ASL DNA sequence is released with the same
restriction enzyme(s) and purified.

The GL promoter region sequence and the ASL DNA sequence are then ligated such as with T4
DNA ligase, available from Promega, to form the GSL expression cassette. Fusion of the GL and ASL
DNA sequence is confirmed by restriction enzyme digestion and DNA sequencing. After confirmation
of GL promoter-ASL DNA fusion, the GSL expression cassette is released from the original vector with
suitable restriction enzymes and used in construction of vectors for plant transformation.
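
Confirmation by restriction enzyme digestion rests on comparing observed fragment sizes with those predicted from the expected sequence. The prediction can be sketched as below. The sketch treats the DNA as a linear molecule, which is a simplifying assumption since cloning vectors are circular; the construct is hypothetical, and only the recognition sequences and cut positions of the named enzymes are standard.

```python
# Recognition sequence and top-strand cut offset for a few common enzymes.
# All three sites are palindromic, so searching one strand is sufficient.
ENZYMES = {
    "XbaI": ("TCTAGA", 1),     # T^CTAGA
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
}


def cut_sites(seq, enzyme):
    """Return the 0-based positions at which `enzyme` cuts `seq`."""
    site, offset = ENZYMES[enzyme]
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i + offset)
        i = seq.find(site, i + 1)
    return cuts


def fragment_lengths(seq, enzymes):
    """Predicted fragment lengths after digesting linear `seq` to completion."""
    cuts = sorted({c for name in enzymes for c in cut_sites(seq, name)})
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical construct with one XbaI site and one BamHI site.
construct = "AAAA" + "TCTAGA" + "CCCCCC" + "GGATCC" + "TT"
print(fragment_lengths(construct, ["XbaI", "BamHI"]))  # [5, 12, 7]
```

A digest whose fragment pattern departs from the predicted lengths would suggest that the promoter and coding sequence were not joined as intended.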

IV. Fusing the ASL DNA Sequence to a Constitutive Promoter Region

In an alternative embodiment, a standard constitutive promoter may be fused with the ASL
DNA sequence to make a GSL expression cassette. For example, a standard constitutive promoter may
be fused with P450-1 to form an expression cassette for insertion of P450-1 sequences into a
gymnosperm genome. In addition, a standard constitutive promoter may be fused with P450-2 to form
an expression cassette for insertion of P450-2 into a gymnosperm genome. A constitutive promoter for
use in the invention is the double 35S promoter.

In the preferred practice of the invention using constitutive promoters, a suitable vector such as
pBI221 is digested by XbaI and HindIII to release the 35S promoter. At the same time the vector
pHygro, available from International Paper, was digested by XbaI and HindIII to release the double 35S
promoter. The double 35S promoter was ligated to the previously digested pBI221 vector to produce a
new pBI221 with the double 35S promoter. This new pBI221 was digested with SacI and SmaI to
release the GUS fragment. The vector is next treated with T4 DNA polymerase to produce blunt ends
and the vector is self-ligated. This vector is then further digested with BamHI and XbaI, available from
Promega. After the pBI221 vector containing the constitutive promoter region has been prepared,
lignin gene sequences are prepared for insertion into the pBI221 vector.

The coding regions of sweetgum P450-1 or P450-2 are amplified by PCR using primers with
restriction sites incorporated in the 5' and 3' ends. In one example, an XbaI site was incorporated at
the 5' end and a BamHI site was incorporated at the 3' end of the sweetgum P450-1 or P450-2 genes.
After PCR, the P450-1 and P450-2 genes were separately cloned into a TA vector available from
Invitrogen. The TA vectors containing the P450-1 and P450-2 genes, respectively, were digested by
XbaI and BamHI to release the P450-1 or P450-2 sequences.
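
Incorporating restriction sites into PCR primers can be illustrated with a short sketch. The coding sequence, the length of the annealing region and the function names are hypothetical; only the XbaI and BamHI recognition sequences are standard. The sketch also checks that neither site occurs inside the coding region, since an internal site would cause the later digest to cut the gene itself.

```python
SITES = {"XbaI": "TCTAGA", "BamHI": "GGATCC"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def tailed_primers(coding, five_prime_enzyme, three_prime_enzyme, anneal=8):
    """Build forward and reverse primers, each with a restriction site at its
    5' end followed by `anneal` bases that match the template."""
    forward = SITES[five_prime_enzyme] + coding[:anneal]
    reverse = SITES[three_prime_enzyme] + reverse_complement(coding[-anneal:])
    return forward, reverse


def internal_sites(coding, enzymes):
    """Map each enzyme name to the positions of its site inside `coding`."""
    found = {}
    for name in enzymes:
        site = SITES[name]
        found[name] = [
            i for i in range(len(coding) - len(site) + 1)
            if coding.startswith(site, i)
        ]
    return found


# Hypothetical coding region, far shorter than a real P450 gene.
coding = "ATGGCTTCCAAGGTTTAA"
print(tailed_primers(coding, "XbaI", "BamHI"))
# ('TCTAGAATGGCTTC', 'GGATCCTTAAACCT')
print(internal_sites(coding, ["XbaI", "BamHI"]))
# {'XbaI': [], 'BamHI': []}
```

Using two different enzymes, as in the example described above, fixes the orientation of the insert relative to the promoter when the fragment is ligated into the digested vector.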

The p35SS vector, described above, and the isolated sweetgum P450-1 or P450-2 fragments
were then ligated to make GSL expression cassettes containing the constitutive promoter.

V. Inserting the Expression Cassette into the Gymnosperm Genome

There are a number of methods by which the GSL expression cassette may be inserted into a
target gymnosperm cell. One method of inserting the expression cassette into the gymnosperm is by
micro-projectile bombardment of gymnosperm cells. For example, embryogenic tissue cultures of
loblolly pine may be initiated from immature zygotic embryos. Tissue is maintained in an
undifferentiated state on semi-solid proliferation medium. For transformation, embryogenic tissue is
suspended in liquid proliferation medium. Cells are then sieved through a screen, preferably 40 mesh,
to separate small, densely cytoplasmic cells from large vacuolar cells.

After separation, a portion of the liquid cell suspension fraction is vacuum deposited onto filter
paper and placed on semi-solid proliferation medium. The prepared gymnosperm target cells are then
grown for several days on filter paper discs in a petri dish.

A 1:1 mixture of plasmid DNA containing the selectable marker expression cassette and
plasmid DNA containing the P450-1 expression cassette may be precipitated with gold to form
microprojectiles. The microprojectiles are rinsed in absolute ethanol and aliquots are dried onto a
suitable macrocarrier such as the macrocarrier available from BioRad in Hercules, CA.

Prior to bombardment, embryogenic tissue is preferably desiccated under a sterile laminar-flow
hood. The desiccated tissue is transferred to semi-solid proliferation medium. The prepared
microprojectiles are accelerated from the macrocarrier into the desiccated target cells using a suitable
apparatus such as a BioRad PDS-1000/He particle gun. In a preferred method, each plate is
bombarded once, rotated 180 degrees, and bombarded a second time. Preferred bombardment
parameters are 1350 psi rupture disc pressure, 6 mm distance from the rupture disc to macrocarrier
(gap distance), 1 cm macrocarrier travel distance, and 10 cm distance from macrocarrier stopping
screen to culture plate (microcarrier travel distance). Tissue is then transferred to semi-solid
proliferation medium containing a selection agent, such as hygromycin B, for two days after
bombardment.

Other methods of inserting the GSL expression cassette include use of silicon carbide whiskers,
transformed protoplasts, Agrobacterium vectors and electroporation.

VI. Identifying Transformed Cells

In general, insertion of the GSL expression cassette will typically be carried out in a mass of
cells and it will be necessary to determine which cells harbor the recombinant DNA molecule
containing the GSL expression cassette. Transformed cells are first identified by their ability to grow
vigorously on a medium containing an antibiotic which is toxic to non-transformed cells. Preferred
antibiotics are kanamycin and hygromycin B. Cells which grow vigorously on antibiotic containing
medium are further tested for the presence of portions of the plasmid vector, of the syringyl lignin
genes in the GSL expression cassette (e.g., the angiosperm bi-OMT, 4CL, P450-1 or P450-2 gene), or
of other fragments in the GSL expression cassette. Specific methods which can
be used to test for presence of portions of the GSL expression cassette include Southern blotting with a
labeled complementary probe or PCR amplification with specific complementary primers. In yet
another approach, an expressed syringyl lignin enzyme can be detected by Western blotting with a
specific antibody, or by assaying for a functional property such as the appearance of functional
enzymatic activity.
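
The logic of testing by PCR amplification with specific primers can be sketched in software: a product of the expected size is predicted only when both primer binding sites are present in the template. The sequences below are hypothetical stand-ins rather than the sweetgum P450-1 sequence, and exact string matching is a simplification of primer annealing.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def pcr_product_length(template, forward, reverse):
    """Return the predicted amplicon length on the top strand of `template`,
    or None if either primer has no exact binding site."""
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical transgene and host fragments, for illustration only.
transgene = "ATGGCTTCC" + "AAGGTT" + "GACCTGTAA"
transformed = "GGG" + transgene + "CCC"
untransformed = "GGGCCC"

forward = "ATGGCTTCC"
reverse = reverse_complement("GACCTGTAA")

print(pcr_product_length(transformed, forward, reverse))    # 24
print(pcr_product_length(untransformed, forward, reverse))  # None
```

In the laboratory the same comparison is made on a gel, where a band of the predicted size in transformed cells and its absence in untransformed controls indicates the presence of the inserted gene.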

VII. Production of a Gymnosperm Plant from the Transformed Gymnosperm Cell

Once transformed embryogenic cells of the gymnosperm have been identified, isolated and
multiplied, they may be grown into plants. It is expected that all plants resulting from transformed cells
will contain the GSL expression cassette in all their cells, and that wood in the secondary growth stage
of the mature plant will be characterized by the presence of syringyl lignin.

Transgenic embryogenic cells are allowed to replicate and develop into somatic embryos,
which are then converted into somatic seedlings.

VIII. Identification, Production and Insertion of a GL mRNA Anti-Sense Sequence

In addition to adding ASL DNA sequences, anti-sense sequences may be incorporated into a 
gymnosperm genome, via GSL expression cassettes, in order to suppress formation of the less preferred 
native gymnosperm lignin. To this end, the gymnosperm lignin gene is first located and sequenced in 
order to determine its nucleotide sequence. Methods for locating and sequencing amino acids which 
have been previously discussed may be employed. For example, if the gymnosperm lignin gene has 
already been purified, standard sequencing methods may be employed to determine the DNA nucleic 
acid sequence. 

If the gymnosperm lignin gene has not been purified and functionally similar DNA or mRNA 
sequences from similar species are known, those sequences may be compared to identify highly 
conserved regions and this information used as a basis for the construction of a probe. A gymnosperm 
cDNA or genomic library can be probed with the above mentioned sequences to locate the gymnosperm 
lignin cDNA or genomic DNA. Once the gymnosperm lignin DNA is located, it may be sequenced 
using standard sequencing methods. 

After the DNA sequence has been obtained for a gymnosperm lignin sequence, the 
complementary anti-sense strand is constructed and incorporated into an expression cassette. For 
example, the GL mRNA anti-sense sequence may be fused to a promoter region to form an expression 
cassette as described above. In a preferred method, the GL mRNA anti-sense sequence is incorporated 
into the previously discussed GSL expression cassette which is inserted into the gymnosperm genome as 
described above. 
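
Constructing the complementary anti-sense strand is, at the sequence level, a reverse-complement operation. A minimal sketch follows, using a hypothetical nine-base fragment rather than any gymnosperm lignin sequence. Transcribing the anti-sense strand gives an RNA that can pair with the sense mRNA.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_dna(sense):
    """Return the anti-sense strand of `sense`, written 5' to 3'."""
    return sense.upper().translate(_COMPLEMENT)[::-1]


def as_rna(dna):
    """Return the RNA with the same base sequence as `dna` (T becomes U)."""
    return dna.replace("T", "U")


# Hypothetical fragment of a coding (sense) strand, for illustration only.
sense = "ATGGCTAAG"
anti = antisense_dna(sense)

print(as_rna(sense))  # AUGGCUAAG  (sense mRNA)
print(as_rna(anti))   # CUUAGCCAU  (anti-sense RNA)
```

Applying `antisense_dna` twice returns the original sequence, which is a convenient check that the two strands are fully complementary.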

IX. Inclusion of Cytochrome P450 Reductase (CPR) to Enhance Biosynthesis of Syringyl Lignin in Gymnosperms

In the absence of external cofactors such as NADPH (an electron donor in reductive
biosyntheses), certain angiosperm lignin genes such as the P450 genes may remain inactive or not
achieve full or desired activity after insertion into the genome of a gymnosperm. Inactivity or
insufficient activity can be determined by testing the resulting plant which contains the P450 genes for
the presence of syringyl lignin in secondary growth. It is known that cytochrome P450 reductase (CPR)
may be involved in promoting certain reductive biochemical reactions, and may activate the desired
expression of genes in many plants. Accordingly, if it is desired to enhance the expression of the
angiosperm syringyl lignin genes in the gymnosperm, CPR may be inserted in the gymnosperm
genome. In order to express CPR, the DNA sequence of the enzyme is ligated to a constitutive
promoter or, for a specific species such as loblolly pine, xylem-specific lignin promoters such as PAL,
4CL1B or 4CL3B to form an expression cassette. The expression cassette may then be inserted into the
gymnosperm genome by various methods as described above.

X. Examples

The following non-limiting examples illustrate further aspects of the invention. In these
examples, the angiosperm is Liquidambar styraciflua (L.) [sweetgum] and the gymnosperm is Pinus
taeda (L.) [loblolly pine]. The nomenclature for the genes referred to in the examples is as follows:



Genes                   Biochemical Name

4CL (angiosperm)        4-coumarate CoA ligase
bi-OMT (angiosperm)     bifunctional-O-methyl transferase
P450-1 (angiosperm)     cytochrome P450
P450-2 (angiosperm)     cytochrome P450
PAL (gymnosperm)        phenylalanine ammonia-lyase
4CL1B (gymnosperm)      4-coumarate CoA ligase
4CL3B (gymnosperm)      4-coumarate CoA ligase

Example 1 - Isolating and Sequencing bi-OMT and 4CL Genes from an Angiosperm

A cDNA library for sweetgum was constructed in Lambda ZAP II, available from Stratagene,
of La Jolla, CA, using poly(A)+ RNA isolated from sweetgum xylem tissue. Probes for bi-OMT and
4CL were obtained through reverse transcription of their mRNAs followed by double PCR using
gene-specific primers which were designed based on the OMT and CCL cDNA sequences obtained
from similar genes cloned from other species.

Four primers were used for amplifying OMT fragments. One was an oligo-dT primer. One
was a bi-OMT primer (which was used to clone gene fragments through the modified differential display
technique, as described below in Example 2), and the other two were degenerate primers, which were
based on the conserved sequences of all known OMTs. The two degenerate primers were derived
from the following amino acid sequences:

5'-Gly Gly Met Ala Thr Tyr Cys Cys Ala Thr Thr Tyr Ala Ala Cys Ala Ala Gly Gly Cys-3'
(primer #22) and

3'-Ala Ala Ala Gly Ala Gly Ala Gly Asn Ala Cys Asn Asn Ala Asn Asn Ala Asn Gly Ala-5'
(primer #23).

A 900 bp PCR product was produced when oligo-dT primer and primer #22 were used, and a 
550 bp fragment was produced when primer numbers 22 and 23 were used. 

Three primers were used for amplifying CCL fragments. They were derived from the
following amino acid sequences:

5'-Thr Thr Gly Gly Ala Thr Cys Cys Gly Gly Ile Ala Cys Ile Ala Cys Ile Gly Gly Ile Tyr Thr
Ile Cys Cys Ile Ala Ala Arg Gly Gly-3' (primer R1S)

5'-Thr Thr Gly Gly Ala Thr Cys Cys Gly Thr Ile Gly Thr Ile Gly Cys Ile Cys Ala Arg Cys
Ala Arg Gly Thr Ile Gly Ala Tyr Gly Gly-3' (primer H1S) and

3'-Cys Cys Ile Cys Thr Tyr Thr Ala Asp Ala Cys Arg Thr Ala Asp Gly Cys Ile Cys Cys Ala
Gly Cys Thr Gly Thr Ala-5' (primer R2A)

R1S and H1S were both sense primers. Primer R2A was an anti-sense primer. A 650 bp
fragment was produced when primers R1S and R2A were used and a 550 bp fragment was produced when
primers H1S and R2A were used. The sequences of these three primers were derived from conserved
sequences for plant CCLs.
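
The primer listings above are printed in three-letter residue codes. One possible reading, offered here only as an assumption and not stated in the text, is that each residue stands for its one-letter symbol and that those symbols are IUPAC nucleotide codes (with Ile read as I, inosine). Under that reading primers R1S and H1S both begin with TTGGATCC, which contains a GGATCC (BamHI) recognition site. The sketch below performs the conversion and counts the variants in a degenerate primer pool; the function names are hypothetical.

```python
# Assumption: each three-letter residue in the primer listing stands for its
# one-letter symbol, read as an IUPAC nucleotide code (Ile -> I, inosine).
THREE_TO_ONE = {
    "Ala": "A", "Cys": "C", "Gly": "G", "Thr": "T", "Met": "M", "Tyr": "Y",
    "Arg": "R", "Asn": "N", "Asp": "D", "Ser": "S", "Ile": "I",
}

# Number of bases each IUPAC symbol can stand for (inosine counted as one).
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "M": 2, "R": 2, "Y": 2, "S": 2,
    "D": 3, "N": 4,
}


def to_one_letter(listing: str) -> str:
    """Collapse a space-separated three-letter listing to one-letter symbols."""
    return "".join(THREE_TO_ONE[token] for token in listing.split())


def pool_size(primer: str) -> int:
    """Count the distinct sequences covered by a degenerate primer."""
    size = 1
    for symbol in primer:
        size *= IUPAC_CHOICES[symbol]
    return size


PRIMER_22 = to_one_letter(
    "Gly Gly Met Ala Thr Tyr Cys Cys Ala Thr Thr Tyr Ala Ala Cys Ala Ala Gly Gly Cys"
)
PRIMER_R1S = to_one_letter(
    "Thr Thr Gly Gly Ala Thr Cys Cys Gly Gly Ile Ala Cys Ile Ala Cys Ile Gly Gly "
    "Ile Tyr Thr Ile Cys Cys Ile Ala Ala Arg Gly Gly"
)

print(PRIMER_22, pool_size(PRIMER_22))
print(PRIMER_R1S, pool_size(PRIMER_R1S))
```

If this reading is wrong, the conversion is still harmless: it only re-spells the listing and does not alter the sequences given in the text.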

The reverse transcription-double PCR cloning technique used for these examples consisted of
adding 10 µg of DNA-free total RNA in 25 µl DEPC-treated water to a microfuge tube. Next, the
following solutions were added:

a. 5x Reverse transcript buffer 8.0 µl
b. 0.1 M DTT 4.0 µl
c. 10 mM dNTP 2.0 µl
d. 100 µM oligo-dT primers 8.0 µl
e. RNasin 2.0 µl
f. SuperScript II 1.0 µl
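
As a quick consistency check on the reagent list above, the following sketch totals the reaction volume and derives final concentrations by simple dilution (stock concentration x volume / total volume). The volumes are read as 8.0, 4.0, 2.0, 8.0, 2.0 and 1.0 µl, which is an assumption wherever the printed values are unclear; the variable and function names are illustrative only.

```python
# Volumes in microlitres, as read from the reagent list (an assumption where
# the printed values are unclear).
rt_mix_ul = {
    "total RNA in DEPC-treated water": 25.0,
    "5x reverse transcript buffer": 8.0,
    "0.1 M DTT": 4.0,
    "10 mM dNTP": 2.0,
    "100 uM oligo-dT primers": 8.0,
    "RNasin": 2.0,
    "SuperScript II": 1.0,
}

total_ul = sum(rt_mix_ul.values())


def final_conc(stock: float, volume_ul: float, total: float) -> float:
    """Dilution: final concentration in the same unit as the stock."""
    return stock * volume_ul / total


dtt_mM = final_conc(100.0, rt_mix_ul["0.1 M DTT"], total_ul)  # 0.1 M = 100 mM
dntp_mM = final_conc(10.0, rt_mix_ul["10 mM dNTP"], total_ul)
oligo_dT_uM = final_conc(100.0, rt_mix_ul["100 uM oligo-dT primers"], total_ul)

print(f"total volume: {total_ul} ul")
print(f"DTT: {dtt_mM} mM, dNTP: {dntp_mM} mM, oligo-dT: {oligo_dT_uM} uM")
```

Under these readings the components sum to 50 µl.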

After mixing, the tube was incubated at a temperature of 42° C for one (1) hour, followed by
incubation at 70° C for fifteen (15) minutes. Forty (40) µl of 1 N NaOH was added and the tube was
further incubated at 68° C for twenty (20) minutes. After the incubation periods, 80 µl of 1 N HCl was
added to the reaction mixture. At the same time, 17 µl NaOAc, 5 µl glycogen and 768 µl of 100%
ethanol were added and the reaction mixture was maintained at -80° C for 15 minutes in order to
precipitate the cDNA. The precipitated cDNA was centrifuged at high speed at 4° C for 15 minutes.
The resulting pellet was washed with 70% ethanol, dried at room temperature, and then
dissolved in 20 µl of water.

The foregoing procedure produced purified cDNA which was used as a template to carry out
first round PCR using primers #22 and oligo-dT for cloning OMT cDNA and primers R1S and R2A for
cloning 4CL cDNA. For the first round PCR, a master mix of 50 µl for each reaction was prepared.
Each 50 µl mixture contained:

a. 10x buffer 5 µl
b. 25 mM MgCl2 5 µl
c. 100 µM sense primer (primer #22 for OMT and primer R1S for CCL)
d. 100 µM anti-sense primer 1 µl (oligo-dT primer for OMT and R2A for CCL)
e. 10 mM dNTP 1 µl
f. Taq DNA polymerase 0.5 µl

Of this master mix, 48 µl was added into a PCR tube containing 2 µl of cDNA for PCR. The
tube was heated to 95° C for 45 seconds, 52° C for one minute and 72° C for two minutes. This
temperature cycle was repeated for 40 cycles and the mixture was then held at 72° C for 10 minutes.
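
The cycling conditions above can be written down as data, which makes the minimum run time easy to estimate. The sketch below is illustrative only: it sums the stated hold times and ignores ramp times between temperatures, which depend on the instrument and are not given in the text.

```python
# One PCR cycle as (temperature in deg C, hold time in seconds), from the text.
CYCLE = [(95, 45), (52, 60), (72, 120)]
N_CYCLES = 40
FINAL_HOLD = (72, 600)  # final hold at 72 deg C for 10 minutes


def program_seconds(cycle, n_cycles, final_hold):
    """Total hold time of the program, excluding instrument ramp times."""
    per_cycle = sum(seconds for _, seconds in cycle)
    return n_cycles * per_cycle + final_hold[1]


total_s = program_seconds(CYCLE, N_CYCLES, FINAL_HOLD)
print(f"{total_s} s = {total_s / 60:.0f} min of hold time")
```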

The cDNA fragments obtained from the first round of PCR were used as templates to perform
the second round of PCR using primers 22 and 23 for cloning bi-OMT cDNA and primers H1S and R2A
for cloning 4CL cDNA. The second round of PCR conditions were the same as the first round.

The desired cDNA fragment was then sub-cloned and sequenced. After the second round of
PCR, the product with the predicted size was excised from the gel and ligated into a pUC19 vector,
available from Clontech, of Palo Alto, CA, and then transformed into DH5α, an E. coli strain
available from Gibco BRL, of Gaithersburg, MD. After the inserts had been checked for correct size,
the colonies were isolated and plasmids were sequenced using a Sequenase kit available from USB, of
Cleveland, OH. The sequences are shown in Fig. 2 (SEQ ID 5 and 6) and Fig. 3 (SEQ ID 7 and 8).

Example 2 - Alternative Isolation Method of Angiosperm bi-OMT gene 

As previously mentioned, one bi-OMT clone was produced via a modified differential display
technique. This method is another type of reverse transcription-PCR, in which DNA-free total RNA
was reverse transcribed using oligo-dT primers with a single base pair anchor to form cDNA. The
oligo-dT primers used for reverse transcription of mRNA to synthesize cDNA were:
T11A: TTTTTTTTTTTA,
T11C: TTTTTTTTTTTC, and
T11G: TTTTTTTTTTTG.

These cDNAs were then used as templates for radioactive PCR which was conducted in the
presence of the same oligo-dT primers as listed above, a bi-OMT gene-specific primer and 35S-dATP.
The OMT gene-specific primer was derived from the following amino acid sequence: 5'-Cys Cys Asn
Gly Gly Asn Gly Gly Ser Ala Arg Gly Ala-3'.

The following PCR reaction solutions were combined in a microfuge tube:

a. H2O 9.2 µl
b. Taq buffer 2.0 µl
c. dNTP (25 µM) 1.6 µl
d. Primers (5 µM) 2 µl, for each primer
e. 35S-dATP 1 µl
f. Taq pol. 0.2 µl
g. cDNA 2.0 µl

The tube was heated to a temperature of 94° C and held for 45 seconds, then at 37° C for 2 
minutes and then 72°C for 45 seconds for forty cycles, followed by a final reaction at 72°C for 5 
minutes. 

The amplified products were fractionated on a denaturing polyacrylamide sequencing gel and
autoradiography was used to identify and excise the fragments with a predicted size. The designed
OMT gene-specific primer had a sequence conserved in a region toward the 3'-end of the OMT cDNA
sequence. This primer, together with oligo-dT, amplified an OMT cDNA fragment of about
300 bp.

Three oligo-dTs with a single base pair of A, C or G, respectively, were used to pair with the
OMT gene-specific primer. Eight potential OMT cDNA fragments with predicted sizes of about 300 bp
were excised from the gels after several independent PCR rounds using different combinations of
oligo-dT and OMT gene-specific oligonucleotides as primers.

The OMT cDNA fragments were then re-amplified. A Southern blot analysis was performed
for the resulting cDNAs using a 360 base-pair, 32P radio-isotope labeled, aspen OMT cDNA 3'-end
fragment as a probe to identify the cDNA fragments having a strong hybridization signal, under low
stringency conditions. Eight fragments were identified. Out of these eight cDNA fragments, three
were selected based on their high hybridization signal for sub-cloning and sequencing. One clone,
LsOMT3'-1 (where the "Ls" prefix indicates that the clone was derived from the Liquidambar
styraciflua (L.) genome), was confirmed to encode bi-OMT based on its high homology to other
lignin-specific plant OMTs at both nucleotide and amino acid sequence levels.

A cDNA library was constructed in Lambda ZAP II, available from Stratagene, of La Jolla,
CA, using 5 µg poly(A)+ RNA isolated from sweetgum xylem tissue. The primary library consisting of
approximately 0.7 x 10^6 independent recombinants was amplified and approximately 10^5
plaque-forming units (pfu) were screened using a homologous 550 base-pair probe. The hybridized
filter was washed at high stringency (0.25 x SSC, 0.1% SDS, 65° C) conditions. The colony
containing the bi-OMT fragment identified by the probe was eluted and the bi-OMT fragment was
produced. The sequence as illustrated in Fig. 2 (SEQ ID 5 and 6) was obtained.

Example 3 - Isolating and Producing the DNA which codes for the Angiosperm P450-1 Gene

In order to find putative P450 cDNA fragments as probes for cDNA library screening, a highly
degenerate sense primer based on the amino acid sequence 5'-Glu Glu Phe Arg Pro Glu Arg-3'
was designed from the conserved regions found in some plant P450 proteins. This conserved
domain was located upstream of another highly conserved region in P450 proteins, which had an amino
acid sequence of 5'-Phe Gly Xaa Gly Xaa Xaa Cys Xaa Gly-3'. This primer was synthesized with the
incorporation of an XhoI restriction site to give a 26-base-pair oligomer.
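
To give a sense of why such a primer is described as highly degenerate, the sketch below counts how many distinct 21-nucleotide sequences encode Glu-Glu-Phe-Arg-Pro-Glu-Arg under the standard genetic code. This is an upper-bound illustration only: the actual primer composition (for example any use of inosine, partial codons, or the position of the restriction site) is not given in the text, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codons_per_residue() -> dict:
    """Map each one-letter amino acid to its number of synonymous codons."""
    counts = {}
    for aa in CODON_TABLE.values():
        counts[aa] = counts.get(aa, 0) + 1
    return counts


def degeneracy(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the peptide."""
    counts = codons_per_residue()
    total = 1
    for aa in peptide:
        total *= counts[aa]
    return total


CONSERVED_MOTIF = "EEFRPER"  # Glu Glu Phe Arg Pro Glu Arg, from the text
print(degeneracy(CONSERVED_MOTIF))  # 2 * 2 * 2 * 6 * 4 * 2 * 6 = 2304
```

The two arginine residues, with six codons each, account for most of the degeneracy in this motif.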

This primer and the oligo-dT-XhoI primer were then used to perform PCR reactions with the
sweetgum cDNA library as a template. The cDNA library was constructed in Lambda ZAP II, available
from Stratagene, of La Jolla, CA, using poly(A)+ RNA isolated from sweetgum xylem tissue.
Amplified fragments of 300 to 600 bp were obtained. Because the designed primer was located
upstream of the highly conserved P450 domain, this design distinguished whether the PCR products
were P450 gene fragments depending on whether they contained the highly conserved amino acid
domain.

All the fragments obtained from the PCR reaction were then cloned into a pUC19 vector,
available from New England Biolabs, Beverly, MA, and transformed into a DH5α E. coli strain,
available from Gibco BRL, of Gaithersburg, MD.

Twenty-four positive colonies were obtained and sequenced. Sequence analysis indicated four
groupings within the twenty-four colonies. One was C4H, one was an unknown P450 gene, and two
did not belong to P450 genes. Homologies of P450 genes in different species are usually more than
80%. Because the homologies between the P450 gene families found here were around 40%, the
sequence analysis indicated that a new P450 gene family was sequenced. Moreover, since this P450
cDNA was isolated from xylem tissue, it was highly probable that this P450 gene was P450-1.

The novel sweetgum P450 cDNA fragment was used as a probe to screen for a full-length cDNA
encoding P450-1. Once the P450-1 gene was located it was sequenced. The length of the P450-1
cDNA is 1707 bp and it contains 45 bp of 5' non-coding region and 135 bp of 3' non-coding region.
The deduced amino acid sequence also indicates that this P450 cDNA has a hydrophobic core at the
N-terminal, which could be regarded as a leader sequence for co-translational targeting to membranes
during protein synthesis. At the C-terminal region, there is a heme binding domain that is characteristic
of all P450 genes. The P450-1 sequence, as illustrated in Fig. 4 (SEQ ID 1 and 2), was produced
according to the above described methods.
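
The stated lengths allow a simple arithmetic check on the size of the P450-1 coding region. The sketch below assumes, without confirmation from the text, that everything between the two non-coding regions is a single open reading frame and that the stop codon is counted within it; if the stop codon is instead counted in the 3' non-coding region, the residue count is one higher.

```python
def coding_summary(cdna_bp: int, utr5_bp: int, utr3_bp: int):
    """Return (coding bp, codons, residues), assuming one ORF with its stop codon."""
    coding_bp = cdna_bp - utr5_bp - utr3_bp
    if coding_bp % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    codons = coding_bp // 3
    residues = codons - 1  # assumption: the last codon is the stop codon
    return coding_bp, codons, residues


# Figures for the P450-1 cDNA as stated in the text.
coding_bp, codons, residues = coding_summary(1707, 45, 135)
print(coding_bp, codons, residues)
```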

Example 4 - Isolating and Producing the DNA which codes for the Angiosperm P450-2 Gene

By using a similar strategy of synthesizing PCR primers from the published literature for
hydroxylase genes in plants, another full length P450 cDNA has been isolated that shows significant
similarity with a putative F5H clone from Arabidopsis (Meyers et al. 1996: PNAS 93, 6869-6874).
This cloned cDNA, designated P450-2, contains 1883 bp and encodes an open reading frame of 511
amino acids. The amino acid similarity shared between Arabidopsis F5H and the P450-2 sweetgum
clone is about 75%.

To confirm the function of the FA5H-2 gene, it was expressed in E. coli strain DH5 alpha via
pQE vector preparation, according to directions available with the kit. A CO-Fe2+ binding assay was
also performed to confirm the expression of P450-2 as a functional P450 gene (Omura & Sato 1964,
J. of Biochemistry 239: 2370-2378; Babriac et al. 1991, Archives of Biochemistry and Biophysics
288: 302-309). The CO-Fe2+ binding assay showed a peak at 450 nm which indicates that P450-2 has
been overexpressed as a functional P450 gene.

The P450-2 protein was further purified for production of antibodies in rabbits, and antibodies
have been successfully produced. In addition, Western blots show that this antibody is specific to the
membrane fraction of sweetgum and aspen xylem extract. When the P450-2 antibody was added to a
reaction mixture containing aspen xylem tissue, enzyme inhibition studies showed that the activity of
FA5H in aspen was reduced more than 60%, a further indication that P450-2 performs a P450-like
function. Recombinant P450-2 protein co-expressed with Arabidopsis CPR protein in a baculovirus
expression system hydroxylated ferulic acid (specific activity: 7.3 pKat/mg protein), cinnamic acid
(specific activity: 25 pKat/mg protein), and p-coumaric acid (specific activity: 3.8 pKat/mg protein).
The P450-2 enzyme, which may be referred to as C4C3F5-H, appears to be a broad spectrum
hydroxylase in the phenylpropanoid pathway in plants. Fig. 5 (SEQ ID 3 and 4) illustrates the P450-2
sequence.

Example 5 - Identifying Gymnosperm Promoter Regions

In order to identify gymnosperm promoter regions, sequences from loblolly pine PAL and
4CL1B and 4CL3B lignin genes were used as primers to screen the loblolly pine genomic library, using
the GenomeWalker Kit. The loblolly pine PAL primer sequence was obtained from GenBank,
reference number U39792. The loblolly pine 4CL1B primer sequences were also obtained from
GenBank, reference numbers U39404 and U39405.

The loblolly pine genomic library was constructed in Lambda DASH II, available from
Stratagene, of La Jolla, CA. 3 x 10^6 phage plaques from the genomic library of loblolly pine were
screened using both the above mentioned PAL cDNA and 4CL (PCR clone) fragments as probes. Five
4CL clones were obtained after screening. Lambda DNAs of two of the five 4CL clones obtained
after screening were isolated and digested by EcoRV, PstI, SalI and XbaI for Southern analysis.
Southern analysis using 4CL fragments as probes indicated that both clones for the 4CL gene were
identical. Results from further mapping showed that none of the original five 4CL clones contained
promoter regions. When tested, the PAL clones obtained from the screening also did not contain
promoter regions.

In a second attempt to clone the promoter regions associated with the PAL and 4CL genes, a Universal
GenomeWalker(TM) kit, available from Clontech, was used. In the process, total DNA from loblolly
pine was digested by several restriction enzymes and ligated into the adaptors (libraries) provided with
the kit. Two gene-specific primers for each gene were designed (GSP1 and 2). After two rounds of
PCR using these primers and adapter primers of the kit, several fragments were amplified from each
library. A 1.6 kb fragment and a 0.6 kb fragment for the PAL gene and a 2.3 kb fragment (4CL1B) and a
0.7 kb fragment (4CL3B) for the 4CL gene were cloned, sequenced and found to contain promoter
regions for all three genes. See Fig. 6 (SEQ ID 10), 7 (SEQ ID 11) and 8 (SEQ ID 9).

Example 6 - Fusing the ASL DNA Sequence to a Constitutive Promoter Region and Inserting the
Expression Cassette into a Gymnosperm Genome

As a first step, an ASL DNA sequence, P450-1, was fused with a constitutive promoter region
according to the methods described in the above Section IV to form a P450-1 expression cassette. A
second ASL DNA sequence, P450-2, was then fused with a constitutive promoter in the same manner
to form a P450-2 expression cassette. The P450-1 expression cassette was inserted into the
gymnosperm genome by micro-projectile bombardment. Embryogenic tissue cultures of loblolly pine
were initiated from immature zygotic embryos. The tissue was maintained in an undifferentiated state
on semi-solid proliferation medium, according to methods described by Newton et al. TAES Technical
Publication "Somatic Embryogenesis in Slash Pine", 1995 and Keinonen-Mettala et al. 1996, Scand. J.
For. Res. 11: 242-250.

After separation, 5 ml of the liquid cell suspension fraction which passes through the 40 mesh
screen was vacuum deposited onto filter paper and placed on semi-solid proliferation medium. The
prepared gymnosperm target cells were then grown for 2 days on filter paper discs placed on semi-solid
proliferation medium in a petri dish. These target cells were then bombarded with plasmid DNA
containing the P450-1 expression cassette and an expression cassette containing a selectable marker
gene encoding the enzyme which confers resistance to the antibiotic hygromycin B. A 1:1 mixture
of selectable marker expression cassette and plasmid DNA containing the P450-1 expression cassette was
precipitated with gold (1.5-3.0 microns) as described by Sanford et al. (1992). The DNA-coated
microprojectiles were rinsed in absolute ethanol and aliquots of 10 µl (5 µg DNA/3 mg gold) were dried
onto a macrocarrier, such as those available from BioRad (Hercules, CA).

Prior to bombardment, embryogenic tissue was desiccated under a sterile laminar-flow hood
for 5 minutes. The desiccated tissue was transferred to semi-solid proliferation medium. The
microprojectiles were accelerated into desiccated target cells using a BioRad PDS-1000/He particle
gun.

Each plate was bombarded once, rotated 180 degrees, and bombarded a second time. Preferred
bombardment parameters were 1350 psi rupture disc pressure, 6 mm distance from the rupture disc to
macrocarrier (gap distance), 1 cm macrocarrier travel distance, and 10 cm distance from macrocarrier
stopping screen to culture plate (microcarrier travel distance). Tissue was then transferred to semi-solid
proliferation medium containing hygromycin B for two days after bombardment.

The P450-2 expression cassette was inserted into the gymnosperm genome according to the
same procedures.

Example 7 - Selecting Transformed Target Cells

After insertion of the P450-2 expression cassette and the selectable marker expression cassette
into the gymnosperm target cells as described in Example 6, transformed cells were selected by
exposure to an antibiotic that causes mortality of any cells not containing the GSL expression cassette.
Forty independent cell lines were established from cultures co-bombarded with an expression cassette
containing a hygromycin resistance gene construct and the P450-1 construct. These cell lines include
lines Y2, Y17, Y7 and 04, as discussed in more detail below.

PCR techniques were then used to verify that the P450-1 gene had been successfully integrated
into the genomes of the established cell lines by extracting genomic DNA using the Plant DNAeasy kit,
available from Qiagen. 200 ng DNA from each cell line were used for each PCR reaction. Two
P450-1 specific primers were designed to perform a PCR reaction with a 600 bp PCR product size.
The primers were:

LsP450-iml-S primer: ATGGCTTTCCTTCTAATACCCATCTC, and
LsP450-iml-A primer: GGGTGTAATGGACGAGCAAGGACTTG.
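
Basic properties of the two verification primers can be tabulated directly from their sequences. The sketch below, offered as an illustration rather than as part of the described method, reports length and GC content for each primer and the reverse complement of the anti-sense primer (the strand it would match on the sense sequence); the function names are hypothetical.

```python
# Primer sequences as given in the text (spacing artifacts removed).
SENSE_PRIMER = "ATGGCTTTCCTTCTAATACCCATCTC"
ANTISENSE_PRIMER = "GGGTGTAATGGACGAGCAAGGACTTG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_count(seq: str) -> int:
    """Number of G and C bases in the sequence."""
    return sum(1 for base in seq if base in "GC")


def gc_percent(seq: str) -> float:
    """GC content as a percentage of sequence length."""
    return 100.0 * gc_count(seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


for name, seq in (("sense", SENSE_PRIMER), ("anti-sense", ANTISENSE_PRIMER)):
    print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%")
print("anti-sense reverse complement:", reverse_complement(ANTISENSE_PRIMER))
```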

Each PCR reaction (100 µl) consisted of 75 µl H2O, 1 µl MgCl2 (25 mM), 10 µl PCR buffer, 1
µl 10 mM dNTPs, and 10 µl DNA. 100 µl oil was layered on the top of each reaction mix. Hot start
PCR was done as follows: the PCR reaction was incubated at 95 degrees C for 7 minutes and 1 µl each of
both LsP450-iml-S and LsP450-iml-A primers (100 µM stock) and 1 µl of Taq polymerase were added
through the oil in each reaction. The PCR program used was 95 degrees C for 1.5 minutes, 55 degrees C
for 45 seconds and 72 degrees C for 2 minutes, repeated for 40 cycles, followed by extension at 72
degrees C for 10 minutes.

The above PCR products were employed to determine if gymnosperm cells contained the 
angiosperm lignin gene sequences. With reference to Fig. 9, PCR amplification was performed using 
template DNA from cells which grew vigorously on hygromycin B-containing medium. The PCR 
products were electrophoresed in an agarose gel containing 9 lanes. Lanes 1-4 contained PCR 
amplification of products of the Sweetgum P450-1 gene from a non-transformed control and transgenic 
loblolly pine cell lines. Lane 1 contained the non-transformed control PT52. Lane 2 contained 
transgenic line Y2. Lane 3 contained transgenic line Y17 and Lane 4 contained the plasmid which 
contains the expression cassette pSSLsP450-1-im-s. Lanes 2 through 4 all contain an amplified fragment
of about 600 bp, indicating that the P450-1 gene has been successfully inserted into transgenic cell lines 
Y2 and Y17. 

Lane 5 contained a DNA size marker PhiX174/HaeIII (BRL). The top four bands in this lane
indicate molecular sizes of 1353, 1078, 872 and 603 bp. 

Lanes 6-9 contained PCR amplification products of the hygromycin B gene from non-transformed
control and transgenic loblolly pine cell lines. Lane 6 contained the non-transformed control line
referred to as PT52. Lane 7 contained transgenic line Y7. Lane 8 contained transgenic line 04.
Lane 9 contained the plasmid which includes the expression cassette containing the gene encoding the
enzyme which confers resistance to the antibiotic hygromycin B. Lanes 7-9 all show an amplified
fragment of about 1000 bp, indicating that the hygromycin gene has been successfully inserted into
transgenic lines Y7 and 04.

These PCR results confirmed the presence of P450-1 and hygromycin resistance gene in 
transformed loblolly pine cell cultures. The results obtained from the PCR verification of 4 cell lines, 
and similar tests with the remaining 36 cell lines, confirm stable integration of the P450-1 gene and 
the hygromycin B gene in 25% of the 40 cell lines. 

In addition, loblolly pine embryogenic cells which have been co-bombarded with the P450-2 
and hygromycin B expression cassettes, are growing vigorously on hygromycin selection medium, 
indicating that the P450-2 expression cassette was successfully integrated into the gymnosperm genome. 

Although various embodiments and features of the invention have been described in the 
foregoing detailed description, those of ordinary skill will recognize the invention is capable of 
numerous modifications, rearrangements and substitutions without departing from the scope of the 
invention as set forth in the appended claims. For example, in the case where the lignin DNA sequence 
is transcribed and translated to produce a functional syringyl lignin gene, those of ordinary skill will 
recognize that because of codon degeneracy a number of polynucleotide sequences will encode the same 
gene. These variants are intended to be covered by the DNA sequences disclosed and claimed herein. 
In addition, the sequences claimed herein include those sequences which encode a gene having substantial
functional identity with those claimed. Thus, in the case of syringyl lignin genes, for example, the 
DNA sequences include variant polynucleotide sequences encoding polypeptides which have substantial 
identity with the amino acid sequence of syringyl lignin and which show syringyl lignin activity in 
gymnosperms. 

What is claimed is:

1. A method for modifying the genome of a gymnosperm which comprises cloning one or
more angiosperm DNA sequences which code for genes necessary for production of angiosperm 
syringyl lignin monomer units, fusing one or more of the angiosperm DNA sequences to a 
promoter region associated with a gene to form an expression cassette and inserting the 
expression cassette into the gymnosperm genome to thereby produce a modified genome in the 
gymnosperm containing genes which code for enzymes which produce syringyl lignin monomer 
units. 

2. The method of claim 1, further comprising incorporating a genetic sequence which 
codes for anti-sense mRNA into the gymnosperm genome in order to suppress formation of 
guaiacyl lignin monomer units. 

3. A gymnosperm plant containing an expression cassette produced according to the 
method of claim 1. 

4. A loblolly pine containing an expression cassette produced according to the method of 
claim 1. 

5. The method of claim 1 wherein the angiosperm DNA sequences are selected from the
class consisting of 4-coumarate CoA ligase (4CL), bifunctional-O-methyl transferase (bi-OMT)
and P450-1 and P450-2.

6. The method of claim 1 wherein the promoter region is selected from the class 
consisting of the 5' flanking region of phenylalanine ammonia-lyase (PAL) and the 5' flanking 
region of 4-coumarate CoA ligase (4CL1B and 4CL3B). 

7. The method of claim 1 wherein the expression cassette is inserted into the 
gymnosperm genome by way of the transformation vector Agrobacterium. 

8. The method of claim 7 wherein the Agrobacterium is Agrobacterium tumefaciens
EH101.

9. The method of claim 1 wherein the expression cassette is inserted into the 
gymnosperm genome via direct DNA delivery to a target cell. 

10. The method of claim 1 wherein the expression cassette is inserted into the 
gymnosperm genome by micro-projectile bombardment of a gymnosperm cell. 

11. The method of claim 1 wherein the expression cassette is inserted into the
gymnosperm genome by electroporation of a gymnosperm cell. 

12. The method of claim 1 wherein the expression cassette is inserted into the
gymnosperm genome via silicon carbide whiskers.

13. The method of claim 1 wherein the expression cassette is inserted into the 
gymnosperm genome via transformed protoplast. 

14. The method of claim 1 further comprising inserting a selectable marker into the 
expression cassette. 

15. The method of claim 14 wherein the selectable marker is selected from the group 
consisting of kanamycin and hygromycin B. 

16. The method of claim 2 wherein the anti-sense mRNA is a gymnosperm genetic 
sequence which codes for the 4-coumarate CoA ligase (4CL) gene. 

17. The method of claim 1 wherein the promoter region is a DNA sequence which 
includes the 5' flanking region of the gymnosperm loblolly pine PAL gene. 

18. The method of claim 1 wherein the promoter region is a DNA sequence which
includes the 5' flanking region of the gymnosperm loblolly pine 4CL1B gene. 

19. The method of claim 1 wherein the promoter region is a DNA sequence which 
includes the 5' flanking region of the gymnosperm loblolly pine 4CL3B gene.

20. The method of claim 1 wherein the promoter region includes a constitutive promoter. 

21. An isolated P450-1 DNA sequence which encodes an enzyme involved in the 
biosynthesis of syringyl lignin monomer units, wherein said DNA is as shown in SEQ ID. No. 1 
and 2. 

22. An isolated P450-2 DNA sequence which encodes an enzyme involved in the 
biosynthesis of syringyl lignin monomer units, wherein said DNA is as shown in SEQ ID. No. 3 
and 4. 

23. An isolated bi-OMT DNA sequence which encodes an enzyme involved in the 
biosynthesis of syringyl lignin monomer units, wherein said DNA is as shown in SEQ ID No. 5 
and 6. 

24. An isolated 4CL DNA sequence which encodes an enzyme involved in the 
biosynthesis of syringyl lignin monomer units, wherein said DNA is as shown in SEQ ID No. 7 
and 8. 

25. An isolated DNA, wherein said DNA encodes for an enzyme involved in the 
biosynthesis of one or more syringyl lignin monomer units. 

26. An isolated DNA sequence which includes the 5' flanking region of the gymnosperm
loblolly pine PAL gene, containing the lignin promoter region and regulatory elements for
gymnosperm lignin biosynthesis as shown in SEQ ID No. 9.

27. An isolated DNA sequence which includes the 5' flanking region of the gymnosperm
loblolly pine 4CL1B, containing the lignin promoter region and regulatory elements for
gymnosperm lignin biosynthesis as shown in SEQ ID No. 10.

28. An isolated DNA sequence which includes the 5' flanking region of gymnosperm
loblolly pine 4CL3B, containing the lignin promoter region and regulatory elements for
gymnosperm lignin biosynthesis as shown in SEQ ID No. 11.

29. An isolated DNA, wherein said DNA includes the promoter region of a gymnosperm
gene involved in lignin biosynthesis. 

30. A method for modifying the genome of loblolly pine which comprises cloning one or
more angiosperm DNA sequences which code for enzymes necessary for production of syringyl
lignin monomer units, fusing one or more of the angiosperm DNA sequences to a promoter
region to form an expression cassette, and inserting the expression cassette into the loblolly pine
genome to thereby produce a modified genome in the loblolly pine containing genes which code
for enzymes which produce syringyl lignin monomer units.

31. The method of claim 30 wherein the promoter region is a constitutive promoter.

32. A loblolly pine containing an expression cassette produced according to claim 30. 

33. The method of claim 30 wherein the angiosperm DNA sequence is selected from the
class consisting of 4-coumarate CoA ligase (4CL), bifunctional-O-methyl transferase (bi-OMT)
and P450-1 and P450-2.

34. A loblolly pine containing one or more of the DNA sequences of claim 33. 

35. A loblolly pine containing the angiosperm DNA sequence inserted by the method of 
claim 30. 

36. A method for modifying the genome of loblolly pine which comprises cloning the 
sweetgum P450-1 gene, fusing it to a constitutive promoter to form an expression cassette, and 
inserting the expression cassette into the loblolly pine genome. 

37. A loblolly pine containing the P450-1 gene. 

38. A method for modifying the genome of loblolly pine which comprises cloning the 
sweetgum P450-2 gene, fusing it to a constitutive promoter to form an expression cassette, and 
inserting the expression cassette into the loblolly pine genome. 

39. A loblolly pine containing the P450-2 gene. 

40. A method for modifying the genome of a gymnosperm which comprises cloning the 
sweetgum P450-1 gene, fusing it to a constitutive promoter to form an expression cassette, and
inserting the expression cassette into the gymnosperm genome. 

41. A method for modifying the genome of a gymnosperm which comprises cloning the 
sweetgum P450-2 gene, fusing it to a constitutive promoter to form an expression cassette, and 
inserting the expression cassette into a gymnosperm genome. 

42. A gymnosperm containing the P450-1 gene. 

43. A gymnosperm containing the P450-2 gene. 

44. A gymnosperm containing a DNA sequence selected from the class consisting of the 
P450-1 DNA sequence of SEQ ID No. 1 and 2, the P450-2 DNA sequence of SEQ ID No. 3 and 
4, the bi-OMT DNA sequence of SEQ ID No. 5 and 6, and the 4CL DNA sequences of SEQ ID 
No. 7 and 8. 

45. The gymnosperm of Claim 38, further comprising syringyl lignin. 
SEQ ID 5 

<400> 5 

cggcacgagc cctacctcct ttcttggaaa aatttcccca ttcgatcaca atccgggcct 60 

caaaan atg gga tca aca agc gaa acg aag atg agc ccg agt gaa gca 108 
Met Gly Ser Thr Ser Glu Thr Lys Met Ser Pro Ser Glu Ala 
1 5 10 

gca gca gca gaa gaa gaa gca ttc gta ttc gct atg caa tta acc agt 156 
Ala Ala Ala Glu Glu Glu Ala Phe Val Phe Ala Met Gln Leu Thr Ser 
15 20 25 30 

gct tca gtt ctt ccc atg gtc cta aaa tca gcc ata gag ctc gac gtc 204 
Ala Ser Val Leu Pro Met Val Leu Lys Ser Ala Ile Glu Leu Asp Val 
35 40 45 

tta gaa atc atg gct aaa gct ggt cca ggt gcg cac ata tcc aca tct 252 
Leu Glu Ile Met Ala Lys Ala Gly Pro Gly Ala His Ile Ser Thr Ser 
50 55 60 

gac ata gcc tct aag ctg ccc aca aag aat cca gat gca gcc gtc atg 300 
Asp Ile Ala Ser Lys Leu Pro Thr Lys Asn Pro Asp Ala Ala Val Met 
65 70 75 

ctt gac cgt atg ctc cgc ctc ttg gct agc tac tct gtt cta acg tgc 348 
Leu Asp Arg Met Leu Arg Leu Leu Ala Ser Tyr Ser Val Leu Thr Cys 
80 85 90 

tct ctc cgc acc ctc cct gac ggc aag atc gag agg ctt tac ggc ctt 396 
Ser Leu Arg Thr Leu Pro Asp Gly Lys Ile Glu Arg Leu Tyr Gly Leu 
95 100 105 110 

gca ccc gtt tgt aaa ttc ttg acc aga aac gat gat gga gtc tcc ata 444 
Ala Pro Val Cys Lys Phe Leu Thr Arg Asn Asp Asp Gly Val Ser Ile 
115 120 125 

gcc gct ctg tct ctc atg aat caa gac aag gtc ctc atg gag agc tgg 492 
Ala Ala Leu Ser Leu Met Asn Gln Asp Lys Val Leu Met Glu Ser Trp 
130 135 140 

tac cac ttg acc gag gca gtt ctt gaa ggt gga att cca ttt aac aag 540 
Tyr His Leu Thr Glu Ala Val Leu Glu Gly Gly Ile Pro Phe Asn Lys 
145 150 155 
SEQ ID 5 

gcc i:<U gga atg aca gca ttt gag tac cat ggc acc gat ccc aga ttc 588 
Ala Tyr Gly Met Thr Ala Phe Glu Tyr His Gly Thr Asp Pro Arg Phe 
160 165 170 

aac aca gtt ttc aac aat gga atg tcc aat cat tcg acc att acc atg 636 
Asn Thr Val Phe Asn Asn Gly Met Ser Asn His Ser Thr Ile Thr Met 
175 180 185 190 

aag aaa atc ctt gag act tac aaa ggg ttc gag gga ctt gga tct gtg 684 
Lys Lys Ile Leu Glu Thr Tyr Lys Gly Phe Glu Gly Leu Gly Ser Val 
195 200 205 

gtt gat gtt ggt ggt ggc act ggt gcc cac ctt aac atg att atc gct 732 
Val Asp Val Gly Gly Gly Thr Gly Ala His Leu Asn Met Ile Ile Ala 
210 215 220 

aaa tac ccc atg atc aag ggc att aac ttc gac ttg cct cat gtt att 780 
Lys Tyr Pro Met Ile Lys Gly Ile Asn Phe Asp Leu Pro His Val Ile 
225 230 235 

gag gag gct ccc tcc tat cct ggt gtg gag cat gtt ggt gga gat atg 828 
Glu Glu Ala Pro Ser Tyr Pro Gly Val Glu His Val Gly Gly Asp Met 
240 245 250 

ttt gtt agt gtt cca aaa gga gat gcc att ttc atg aag tgg ata tgt 876 
Phe Val Ser Val Pro Lys Gly Asp Ala Ile Phe Met Lys Trp Ile Cys 
255 260 265 270 

cat gat tgg agc gat gaa cac tgc ttg aag ttt ttg aag aaa tgt tat 924 
His Asp Trp Ser Asp Glu His Cys Leu Lys Phe Leu Lys Lys Cys Tyr 
275 280 285 

gaa gca ctt cca acc aat ggg aag gtg atc ctt gct gaa tgc atc ctc 972 
Glu Ala Leu Pro Thr Asn Gly Lys Val Ile Leu Ala Glu Cys Ile Leu 
290 295 300 

ccc gtg gcg cca gac gca agc ctc ccc act aag gca gtg gtc cat att 1020 
Pro Val Ala Pro Asp Ala Ser Leu Pro Thr Lys Ala Val Val His Ile 
305 310 315 

gat gtc atc atg ttg gct cat aac cca ggt ggg aaa gag aga act gag 1068 
Asp Val Ile Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu 
320 325 330 

aag gag ttt gag gcc ttg gcc aag ggg gct gga ttt gaa ggt ttc cga 1116 
Lys Glu Phe Glu Ala Leu Ala Lys Gly Ala Gly Phe Glu Gly Phe Arg 

Fig. 2B 
SEQ ID 5 

335 340 345 350 

gta gta gcc tcg tgc gct tac aat aca tgg atc atc gad ttt ttg aag 1164 
Val Val Ala Ser Cys Ala Tyr Asn Thr Trp Ile Ile Glu Phe Leu Lys 
355 360 365 

aag aU LgagLccLLa clicggctiltcj aglacataat accaactccl tttggttttc 1220 
Lys Ile 

gagaLtgtga I Lg Lga ULgt gaLtglclict cUllcgcagt IggcctLalg atatiaa tgta 1200 

tcgUaactc gaUcacagaa gtgeaaaaga cagtigaatgL acacUgcttt ataaaalaaa 1340 

aattlitaaqa LLLtgattca LgLaaaaaaa aaaaaaaaaa 1300 



Fig. 2C 






SEQ ID 6 

<400> 6 

Met Gly Ser Thr Ser Glu Thr Lys Met Ser Pro Ser Glu Ala Ala Ala 
1 5 10 15 

Ala Glu Glu Glu Ala Phe Val Phe Ala Met Gln Leu Thr Ser Ala Ser 
20 25 30 

Val Leu Pro Met Val Leu Lys Ser Ala Ile Glu Leu Asp Val Leu Glu 
35 40 45 

Ile Met Ala Lys Ala Gly Pro Gly Ala His Ile Ser Thr Ser Asp Ile 
50 55 60 

Ala Ser Lys Leu Pro Thr Lys Asn Pro Asp Ala Ala Val Met Leu Asp 
65 70 75 80 

Arg Met Leu Arg Leu Leu Ala Ser Tyr Ser Val Leu Thr Cys Ser Leu 
85 90 95 

Arg Thr Leu Pro Asp Gly Lys Ile Glu Arg Leu Tyr Gly Leu Ala Pro 
100 105 110 

Val Cys Lys Phe Leu Thr Arg Asn Asp Asp Gly Val Ser Ile Ala Ala 
115 120 125 

Leu Ser Leu Met Asn Gln Asp Lys Val Leu Met Glu Ser Trp Tyr His 
130 135 140 

Leu Thr Glu Ala Val Leu Glu Gly Gly Ile Pro Phe Asn Lys Ala Tyr 
145 150 155 160 

Gly Met Thr Ala Phe Glu Tyr His Gly Thr Asp Pro Arg Phe Asn Thr 
165 170 175 

Val Phe Asn Asn Gly Met Ser Asn His Ser Thr Ile Thr Met Lys Lys 
180 185 190 

Ile Leu Glu Thr Tyr Lys Gly Phe Glu Gly Leu Gly Ser Val Val Asp 
195 200 205 

Val Gly Gly Gly Thr Gly Ala His Leu Asn Met Ile Ile Ala Lys Tyr 
210 215 220 

Pro Met Ile Lys Gly Ile Asn Phe Asp Leu Pro His Val Ile Glu Glu 
225 230 235 240 

Fig. 2D 
SEQ ID 6 

Ala Pro Ser Tyr Pro Gly Val Glu His Val Gly Gly Asp Met Phe Val 
245 250 255 

Ser Val Pro Lys Gly Asp Ala Ile Phe Met Lys Trp Ile Cys His Asp 
260 265 270 

Trp Ser Asp Glu His Cys Leu Lys Phe Leu Lys Lys Cys Tyr Glu Ala 
275 280 285 

Leu Pro Thr Asn Gly Lys Val Ile Leu Ala Glu Cys Ile Leu Pro Val 
290 295 300 

Ala Pro Asp Ala Ser Leu Pro Thr Lys Ala Val Val His Ile Asp Val 
305 310 315 320 

Ile Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu Lys Glu 
325 330 335 

Phe Glu Ala Leu Ala Lys Gly Ala Gly Phe Glu Gly Phe Arg Val Val 
340 345 350 

Ala Ser Cys Ala Tyr Asn Thr Trp Ile Ile Glu Phe Leu Lys Lys Ile 
355 360 365 

Fig. 2E 
SEQ ID 7 

<400> 7 

cggcacgagc LcnUtLcca cttctggttt gatctctgca attcttccat cagtccctn 59 

atg gag acc caa aca aaa caa gaa gaa atc ata tat cgg tcg aaa ctc 107 
Met Glu Thr Gln Thr Lys Gln Glu Glu Ile Ile Tyr Arg Ser Lys Leu 
1 5 10 15 

ccc gat atc tac atc ccc aaa cac ctc cct tta cat tcg tat tgt ttc 155 
Pro Asp Ile Tyr Ile Pro Lys His Leu Pro Leu His Ser Tyr Cys Phe 
20 25 30 

gag aac atc tca cag ttc ggc tcc cgc ccc tgt ctg atc aat ggc gca 203 
Glu Asn Ile Ser Gln Phe Gly Ser Arg Pro Cys Leu Ile Asn Gly Ala 
35 40 45 

acg ggc aag tat tac aca tat gct gag gtt gag ctc att gcg cgc aag 251 
Thr Gly Lys Tyr Tyr Thr Tyr Ala Glu Val Glu Leu Ile Ala Arg Lys 
50 55 60 

gtc gca tcc ggc ctc aac aaa ctc ggc gtt cga caa ggt gac atc atc 299 
Val Ala Ser Gly Leu Asn Lys Leu Gly Val Arg Gln Gly Asp Ile Ile 
65 70 75 80 

atg ctt ttg cta ccc aac tcg ccg gag ttc gtg ttt tca att ctc ggc 347 
Met Leu Leu Leu Pro Asn Ser Pro Glu Phe Val Phe Ser Ile Leu Gly 
85 90 95 

gca tcc tac cgc ggg gct gcc gcc acc gcc gca aac ccg ttt tat acc 395 
Ala Ser Tyr Arg Gly Ala Ala Ala Thr Ala Ala Asn Pro Phe Tyr Thr 
100 105 110 

cct gcc gag atc agg aag caa gcc aaa acc tcc aac gcc agg ctt att 443 
Pro Ala Glu Ile Arg Lys Gln Ala Lys Thr Ser Asn Ala Arg Leu Ile 
115 120 125 

atc aca cat gcc tgt tac tat gag aaa gtg aag gac ttg gtg gaa gag 491 
Ile Thr His Ala Cys Tyr Tyr Glu Lys Val Lys Asp Leu Val Glu Glu 
130 135 140 

aac gtt gcc aag atc ata tgt ata gac tca ccc ccg gac ggt tgt ttg 539 
Asn Val Ala Lys Ile Ile Cys Ile Asp Ser Pro Pro Asp Gly Cys Leu 
145 150 155 160 

Fig. 3A 
SEQ ID 7 

cac ttc tcg gag ctg agc gag gcg gac gag aac gac atg ccc aat gta 587 
His Phe Ser Glu Leu Ser Glu Ala Asp Glu Asn Asp Met Pro Asn Val 
165 170 175 

gag att gac ccc gat gat gtg gtg gcg ctg ccg tac tcg tca ggg acg 635 
Glu Ile Asp Pro Asp Asp Val Val Ala Leu Pro Tyr Ser Ser Gly Thr 
180 185 190 

acg 99^ tta cca aag ggg gtg atg cta aca cac aag gga caa gtg acg 683 
Thr Gly Leu Pro Lys Gly Val Met Leu Thr His Lys Gly Gln Val Thr 
195 200 205 

agt gtg gcg caa cag gtg gac gga gag aat ccg aac ctg tat ata cat 731 
Ser Val Ala Gln Gln Val Asp Gly Glu Asn Pro Asn Leu Tyr Ile His 
210 215 220 

agc gag gac gtg gtt ctg tgc gtg ttg cct ctg ttt cac atc tac tcg 779 
Ser Glu Asp Val Val Leu Cys Val Leu Pro Leu Phe His Ile Tyr Ser 
225 230 235 240 

atg aac gtc atg ttt tgc ggg tta cga gtt ggt gcg gcg att ctg att 827 
Met Asn Val Met Phe Cys Gly Leu Arg Val Gly Ala Ala Ile Leu Ile 
245 250 255 

atg cag aaa ttt gaa ata tat ggg ttg tta gag ctg gtc aga agt aca 875 
Met Gln Lys Phe Glu Ile Tyr Gly Leu Leu Glu Leu Val Arg Ser Thr 
260 265 270 

ggt gac cat cat gcc tat cgt aca ccc atc gta ttg gca atc tcc aag 923 
Gly Asp His His Ala Tyr Arg Thr Pro Ile Val Leu Ala Ile Ser Lys 
275 280 285 

act ccg gat ctt cac aac tat gat gtg tcc tcc att cgg act gtc atg 971 
Thr Pro Asp Leu His Asn Tyr Asp Val Ser Ser Ile Arg Thr Val Met 
290 295 300 

tca ggt gcg gct cct ctg ggc aag gaa ctt gaa gat tct gtc aga gct 1019 
Ser Gly Ala Ala Pro Leu Gly Lys Glu Leu Glu Asp Ser Val Arg Ala 
305 310 315 320 

aag ttt ccc acc gcc aaa ctt ggt cag gga tat gga atg acg gag gca 1067 
Lys Phe Pro Thr Ala Lys Leu Gly Gln Gly Tyr Gly Met Thr Glu Ala 
325 330 335 

ggg ccc gtg cta gcg atg tgt ttg gca ttt gcc aag gaa ggg ttt gaa 1115 
Gly Pro Val Leu Ala Met Cys Leu Ala Phe Ala Lys Glu Gly Phe Glu 
340 345 350 

Fig. 3B 
SEQ ID 7 

ata aaa tcg ggg gca tct gga act gtt tta agg aac gca cag atg aag 1163 
Ile Lys Ser Gly Ala Ser Gly Thr Val Leu Arg Asn Ala Gln Met Lys 
355 360 365 

att gtg gac cct gaa acc ggt gtc act ctc cct cga aac caa ccc gga 1211 
Ile Val Asp Pro Glu Thr Gly Val Thr Leu Pro Arg Asn Gln Pro Gly 
370 375 380 

gag att tgc att aga gga gac caa atc atg aaa ggt tat ctt aat gat 1259 
Glu Ile Cys Ile Arg Gly Asp Gln Ile Met Lys Gly Tyr Leu Asn Asp 
385 390 395 400 

cct gag gcg acg gag aga acc ata gac aag gaa ggt tgg tta cac aca 1307 
Pro Glu Ala Thr Glu Arg Thr Ile Asp Lys Glu Gly Trp Leu His Thr 
405 410 415 

ggt gat gtg ggc tac atc gac gat gac act gag ctc ttc att gtt gat 1355 
Gly Asp Val Gly Tyr Ile Asp Asp Asp Thr Glu Leu Phe Ile Val Asp 
420 425 430 

cgg ttg aag gaa ctg atc aaa tac aaa ggg ttt cag gtg gca ccc gct 1403 
Arg Leu Lys Glu Leu Ile Lys Tyr Lys Gly Phe Gln Val Ala Pro Ala 
435 440 445 

gag ctt gag gcc atg ctc ctc aac cat ccc aac atc tct gat gct gcc 1451 
Glu Leu Glu Ala Met Leu Leu Asn His Pro Asn Ile Ser Asp Ala Ala 
450 455 460 

gtc gtc cca atg aaa gac gat gaa gct gga gag ctc cct gtg gcg ttt 1499 
Val Val Pro Met Lys Asp Asp Glu Ala Gly Glu Leu Pro Val Ala Phe 
465 470 475 480 

gtt gta aga tca gat ggt tct cag ata tcc gag gct gaa atc agg caa 1547 
Val Val Arg Ser Asp Gly Ser Gln Ile Ser Glu Ala Glu Ile Arg Gln 
485 490 495 

tac atc gca aaa cag gtg gtt ttt tat aaa aga ata cat cgc gta ttt 1595 
Tyr Ile Ala Lys Gln Val Val Phe Tyr Lys Arg Ile His Arg Val Phe 
500 505 510 

ttc gtc gaa gcc att cct aaa gcg ccc tct ggc aaa atc ttg cgg aag 1643 
Phe Val Glu Ala Ile Pro Lys Ala Pro Ser Gly Lys Ile Leu Arg Lys 
515 520 525 

gac ctg aga gcc aaa ttg gcg tct ggt ctt ccc aat taattctcat 1689 
Asp Leu Arg Ala Lys Leu Ala Ser Gly Leu Pro Asn 
530 535 540 

Fig. 3C 
SEQ ID 7 

tcgctaccct cctttctctt atcatacgcc aacacgaacg aagaggctca attaaacgct 1749 
gctcattcga agcggctcao ttaaagctgc tcattcatgt ccaccgagtg ggcagcctyt 1809 
cttgttggga tgttctttca tttgattcag ctgtgagaag ccagaccctc attatttatt 1869 
gtgaaattca caagaatgtc tgtaaatcga tgttgtgagt gatgggtttc aaaacacttt 1929 
tgacattgtt tacgttgtat ttcctgctgt tgaaaataac tactttgtat gacttttatt 1989 
tgggaagata acctttcaaa aaaaaaaaaa aaaaaa 2025 

Fig. 3D 
SEQ ID 8 

<400> 8 

Met Glu Thr Gln Thr Lys Gln Glu Glu Ile Ile Tyr Arg Ser Lys Leu 
1 5 10 15 

Pro Asp Ile Tyr Ile Pro Lys His Leu Pro Leu His Ser Tyr Cys Phe 
20 25 30 

Glu Asn Ile Ser Gln Phe Gly Ser Arg Pro Cys Leu Ile Asn Gly Ala 
35 40 45 

Thr Gly Lys Tyr Tyr Thr Tyr Ala Glu Val Glu Leu Ile Ala Arg Lys 
50 55 60 

Val Ala Ser Gly Leu Asn Lys Leu Gly Val Arg Gln Gly Asp Ile Ile 
65 70 75 80 

Met Leu Leu Leu Pro Asn Ser Pro Glu Phe Val Phe Ser Ile Leu Gly 
85 90 95 

Ala Ser Tyr Arg Gly Ala Ala Ala Thr Ala Ala Asn Pro Phe Tyr Thr 
100 105 110 

Pro Ala Glu Ile Arg Lys Gln Ala Lys Thr Ser Asn Ala Arg Leu Ile 
115 120 125 

Ile Thr His Ala Cys Tyr Tyr Glu Lys Val Lys Asp Leu Val Glu Glu 
130 135 140 

Asn Val Ala Lys Ile Ile Cys Ile Asp Ser Pro Pro Asp Gly Cys Leu 
145 150 155 160 

His Phe Ser Glu Leu Ser Glu Ala Asp Glu Asn Asp Met Pro Asn Val 
165 170 175 

Glu Ile Asp Pro Asp Asp Val Val Ala Leu Pro Tyr Ser Ser Gly Thr 
180 185 190 

Thr Gly Leu Pro Lys Gly Val Met Leu Thr His Lys Gly Gln Val Thr 
195 200 205 

Ser Val Ala Gln Gln Val Asp Gly Glu Asn Pro Asn Leu Tyr Ile His 
210 215 220 

Ser Glu Asp Val Val Leu Cys Val Leu Pro Leu Phe His Ile Tyr Ser 
225 230 235 240 

Fig. 3E 
SEQ ID 8 

Met Asn Val Met Phe Cys Gly Leu Arg Val Gly Ala Ala Ile Leu Ile 
245 250 255 

Met Gln Lys Phe Glu Ile Tyr Gly Leu Leu Glu Leu Val Arg Ser Thr 
260 265 270 

Gly Asp His His Ala Tyr Arg Thr Pro Ile Val Leu Ala Ile Ser Lys 
275 280 285 

Thr Pro Asp Leu His Asn Tyr Asp Val Ser Ser Ile Arg Thr Val Met 
290 295 300 

Ser Gly Ala Ala Pro Leu Gly Lys Glu Leu Glu Asp Ser Val Arg Ala 
305 310 315 320 

Lys Phe Pro Thr Ala Lys Leu Gly Gln Gly Tyr Gly Met Thr Glu Ala 
325 330 335 

Gly Pro Val Leu Ala Met Cys Leu Ala Phe Ala Lys Glu Gly Phe Glu 
340 345 350 

Ile Lys Ser Gly Ala Ser Gly Thr Val Leu Arg Asn Ala Gln Met Lys 
355 360 365 

Ile Val Asp Pro Glu Thr Gly Val Thr Leu Pro Arg Asn Gln Pro Gly 
370 375 380 

Glu Ile Cys Ile Arg Gly Asp Gln Ile Met Lys Gly Tyr Leu Asn Asp 
385 390 395 400 

Pro Glu Ala Thr Glu Arg Thr Ile Asp Lys Glu Gly Trp Leu His Thr 
405 410 415 

Gly Asp Val Gly Tyr Ile Asp Asp Asp Thr Glu Leu Phe Ile Val Asp 
420 425 430 

Arg Leu Lys Glu Leu Ile Lys Tyr Lys Gly Phe Gln Val Ala Pro Ala 
435 440 445 

Glu Leu Glu Ala Met Leu Leu Asn His Pro Asn Ile Ser Asp Ala Ala 
450 455 460 

Val Val Pro Met Lys Asp Asp Glu Ala Gly Glu Leu Pro Val Ala Phe 
465 470 475 480 

Fig. 3F 
SEQ ID 8 

Val Val Arg Ser Asp Gly Ser Gln Ile Ser Glu Ala Glu Ile Arg Gln 
485 490 495 

Tyr Ile Ala Lys Gln Val Val Phe Tyr Lys Arg Ile His Arg Val Phe 
500 505 510 

Phe Val Glu Ala Ile Pro Lys Ala Pro Ser Gly Lys Ile Leu Arg Lys 
515 520 525 

Asp Leu Arg Ala Lys Leu Ala Ser Gly Leu Pro Asn 
530 535 540 

Fig. 3G 
SEQ ID 1 

cggcacgagg aaacccCaoa actcacctct cttacccttt ctcttca atg gct ttc 56 
Met Ala Phe 
1 

ctt cta ata ccc atc tca ata atc ttc atc gtc tta gct tac cag ctc 104 
Leu Leu Ile Pro Ile Ser Ile Ile Phe Ile Val Leu Ala Tyr Gln Leu 
5 10 15 

tat caa cgg ctc aga ttt aag ctc cca ccc ggc cca cgt cca tgg ccg 152 
Tyr Gln Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro Arg Pro Trp Pro 
20 25 30 35 

atc gtc gga aac ctt tac gac ata aaa ccg gtg agg ttc cgg tgt ttc 200 
Ile Val Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg Phe Arg Cys Phe 
40 45 50 

gcc gag tgg tca caa gcg tac ggt ccg atc ata tcg gtg tgg ttc ggt 248 
Ala Glu Trp Ser Gln Ala Tyr Gly Pro Ile Ile Ser Val Trp Phe Gly 
55 60 65 

tca acg ttg aat gtg atc gta tcg aat tcg gaa ttg gct aag gaa gtg 296 
Ser Thr Leu Asn Val Ile Val Ser Asn Ser Glu Leu Ala Lys Glu Val 
70 75 80 

ctc aag gaa aaa gat caa caa ttg gct gat agg cat agg agt aga tca 344 
Leu Lys Glu Lys Asp Gln Gln Leu Ala Asp Arg His Arg Ser Arg Ser 
85 90 95 

gct gcc aaa ttt agc agg gat ggg cag gac ctt ata tgg gct gat tat 392 
Ala Ala Lys Phe Ser Arg Asp Gly Gln Asp Leu Ile Trp Ala Asp Tyr 
100 105 110 115 

gga cct cac tat gtg aag gtt aca aag gtt tgt acc ctc gag ctt ttt 440 
Gly Pro His Tyr Val Lys Val Thr Lys Val Cys Thr Leu Glu Leu Phe 
120 125 130 

act cca aag cgg ctt gaa gct ctt aga ccc att aga gaa gat gaa gtt 488 
Thr Pro Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg Glu Asp Glu Val 
135 140 145 

aca gcc atg gtt gag tcc att ttt aat gac act gcg aat cct gaa aat 536 
Thr Ala Met Val Glu Ser Ile Phe Asn Asp Thr Ala Asn Pro Glu Asn 
150 155 160 

Fig. 4A 
tat ggg aag agt atg ctg gtg aag aag tat ttg gga gca gta gca ttc 584 
Tyr Gly Lys Ser Met Leu Val Lys Lys Tyr Leu Gly Ala Val Ala Phe 
165 170 175 

aac aac att aca agn ctc gca ttt gga aag cga ttc gtg aat tca gag 632 
Asn Asn Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe Val Asn Ser Glu 
180 185 190 195 

ggt gta atg gac gag caa gga ctt gaa ttt aag gaa att gtg gcc aat 680 
Gly Val Met Asp Glu Gln Gly Leu Glu Phe Lys Glu Ile Val Ala Asn 
200 205 210 

gga ctc aag ctt ggt gcc tca ctt gca atg gct gag cac att cct tgg 728 
Gly Leu Lys Leu Gly Ala Ser Leu Ala Met Ala Glu His Ile Pro Trp 
215 220 225 

ctc cgt tgg atg ttc cca ctt gag gaa ggg gcc ttt gcc aag cat ggg 776 
Leu Arg Trp Met Phe Pro Leu Glu Glu Gly Ala Phe Ala Lys His Gly 
230 235 240 

gca cgt agg gac cga ctt acc aga gct atc atg gaa gag cac aca ata 824 
Ala Arg Arg Asp Arg Leu Thr Arg Ala Ile Met Glu Glu His Thr Ile 
245 250 255 

gcc cgt aaa aag agt ggt gga gcc caa caa cat ttc gtg gat gca ttg 872 
Ala Arg Lys Lys Ser Gly Gly Ala Gln Gln His Phe Val Asp Ala Leu 
260 265 270 275 

ctc acc cta caa gag aaa tat gac ctt agc gag gac act att att ggg 920 
Leu Thr Leu Gln Glu Lys Tyr Asp Leu Ser Glu Asp Thr Ile Ile Gly 
280 285 290 

ctc ctt tgg gat atg atc act gca ggc atg gac aca acc gca atc tct 968 
Leu Leu Trp Asp Met Ile Thr Ala Gly Met Asp Thr Thr Ala Ile Ser 
295 300 305 

gtc gaa tgg gcc atg gcc gag tta att aag aac cca agg gtg caa caa 1016 
Val Glu Trp Ala Met Ala Glu Leu Ile Lys Asn Pro Arg Val Gln Gln 
310 315 320 

aaa gct caa gag gag cta gac aat gta ctt ggg tcc gaa cgt gtc ctg 1064 
Lys Ala Gln Glu Glu Leu Asp Asn Val Leu Gly Ser Glu Arg Val Leu 
325 330 335 

Fig. 4B 
SEQ ID 1 

acc gaa ttg gac ttc tca agc ctc cct tat cta caa tgt gta gcc aag 1112 
Thr Glu Leu Asp Phe Ser Ser Leu Pro Tyr Leu Gln Cys Val Ala Lys 
340 345 350 355 

gag gca cta agg ctg cac cct cca aca cca cta atg ctc cct cat cgc 1160 
Glu Ala Leu Arg Leu His Pro Pro Thr Pro Leu Met Leu Pro His Arg 
360 365 370 

gcc aat gcc aac gtc aaa att ggt ggc tac gac atc cct aag gga tca 1208 
Ala Asn Ala Asn Val Lys Ile Gly Gly Tyr Asp Ile Pro Lys Gly Ser 
375 380 385 

aat gtt cat gta aat gtc tgg gcc gtg gct cgt gat cca gca gtg tgg 1256 
Asn Val His Val Asn Val Trp Ala Val Ala Arg Asp Pro Ala Val Trp 
390 395 400 

cgt gac cca cta gag ttt cga ccg gaa cgg ttc tct gaa gac gat gtc 1304 
Arg Asp Pro Leu Glu Phe Arg Pro Glu Arg Phe Ser Glu Asp Asp Val 
405 410 415 

gac atg aaa ggt cac gat tat agg cta ctg ccg ttt ggt gca ggg agg 1352 
Asp Met Lys Gly His Asp Tyr Arg Leu Leu Pro Phe Gly Ala Gly Arg 
420 425 430 435 

cgt gtt tgc ccc ggt gca caa ctt ggc atc aat ttg gtc aca tcc atg 1400 
Arg Val Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu Val Thr Ser Met 
440 445 450 

atg ggt cac cta ttg cac cat ttc tat tgg agc cct cct aaa ggt gta 1448 
Met Gly His Leu Leu His His Phe Tyr Trp Ser Pro Pro Lys Gly Val 
455 460 465 

aaa cca gag gag att gac atg tca gag aat cca gga ttg gtc acc tac 1496 
Lys Pro Glu Glu Ile Asp Met Ser Glu Asn Pro Gly Leu Val Thr Tyr 
470 475 480 

atg cga acc ccg gtg caa gct gtt ccc act cca agg ctg cct gct cac 1544 
Met Arg Thr Pro Val Gln Ala Val Pro Thr Pro Arg Leu Pro Ala His 
485 490 495 

ttg tac aaa cgt gta gct gtg gat atg taattcttag tttgttatta 1591 
Leu Tyr Lys Arg Val Ala Val Asp Met 
500 505 

Fig. 4C 
SEQ ID 1 

L tcaLgctcU LaaggLtLtg gactlLgaac ttatgatgag at: I: Lglaaaa ttccaagtga 1651 
tcaaatgaag aaaagaccan ataaaaaygc ttgacgattt aaaaaaaaaa aaaaaaa 1708 



Fig. 4D 
SEQ ID 2 

Met Ala Phe Leu Leu Ile Pro Ile Ser Ile Ile Phe Ile Val Leu Ala 
1 5 10 15 

Tyr Gln Leu Tyr Gln Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro Arg 
20 25 30 

Pro Trp Pro Ile Val Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg Phe 
35 40 45 

Arg Cys Phe Ala Glu Trp Ser Gln Ala Tyr Gly Pro Ile Ile Ser Val 
50 55 60 

Trp Phe Gly Ser Thr Leu Asn Val Ile Val Ser Asn Ser Glu Leu Ala 
65 70 75 80 

Lys Glu Val Leu Lys Glu Lys Asp Gln Gln Leu Ala Asp Arg His Arg 
85 90 95 

Ser Arg Ser Ala Ala Lys Phe Ser Arg Asp Gly Gln Asp Leu Ile Trp 
100 105 110 

Ala Asp Tyr Gly Pro His Tyr Val Lys Val Thr Lys Val Cys Thr Leu 
115 120 125 

Glu Leu Phe Thr Pro Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg Glu 
130 135 140 

Asp Glu Val Thr Ala Met Val Glu Ser Ile Phe Asn Asp Thr Ala Asn 
145 150 155 160 

Pro Glu Asn Tyr Gly Lys Ser Met Leu Val Lys Lys Tyr Leu Gly Ala 
165 170 175 

Val Ala Phe Asn Asn Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe Val 
180 185 190 

Asn Ser Glu Gly Val Met Asp Glu Gln Gly Leu Glu Phe Lys Glu Ile 
195 200 205 

Val Ala Asn Gly Leu Lys Leu Gly Ala Ser Leu Ala Met Ala Glu His 
210 215 220 

Ile Pro Trp Leu Arg Trp Met Phe Pro Leu Glu Glu Gly Ala Phe Ala 
225 230 235 240 

Fig. 4E 
SEQ ID 2 

Lys His Gly Ala Arg Arg Asp Arg Leu Thr Arg Ala Ile Met Glu Glu 
245 250 255 

His Thr Ile Ala Arg Lys Lys Ser Gly Gly Ala Gln Gln His Phe Val 
260 265 270 

Asp Ala Leu Leu Thr Leu Gln Glu Lys Tyr Asp Leu Ser Glu Asp Thr 
275 280 285 

Ile Ile Gly Leu Leu Trp Asp Met Ile Thr Ala Gly Met Asp Thr Thr 
290 295 300 

Ala Ile Ser Val Glu Trp Ala Met Ala Glu Leu Ile Lys Asn Pro Arg 
305 310 315 320 

Val Gln Gln Lys Ala Gln Glu Glu Leu Asp Asn Val Leu Gly Ser Glu 
325 330 335 

Arg Val Leu Thr Glu Leu Asp Phe Ser Ser Leu Pro Tyr Leu Gln Cys 
340 345 350 

Val Ala Lys Glu Ala Leu Arg Leu His Pro Pro Thr Pro Leu Met Leu 
355 360 365 

Pro His Arg Ala Asn Ala Asn Val Lys Ile Gly Gly Tyr Asp Ile Pro 
370 375 380 

Lys Gly Ser Asn Val His Val Asn Val Trp Ala Val Ala Arg Asp Pro 
385 390 395 400 

Ala Val Trp Arg Asp Pro Leu Glu Phe Arg Pro Glu Arg Phe Ser Glu 
405 410 415 

Asp Asp Val Asp Met Lys Gly His Asp Tyr Arg Leu Leu Pro Phe Gly 
420 425 430 

Ala Gly Arg Arg Val Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu Val 
435 440 445 

Thr Ser Met Met Gly His Leu Leu His His Phe Tyr Trp Ser Pro Pro 
450 455 460 

Lys Gly Val Lys Pro Glu Glu Ile Asp Met Ser Glu Asn Pro Gly Leu 
465 470 475 480 

Fig. 4F 
SEQ ID 2 

Val Thr Tyr Met Arg Thr Pro Val Gln Ala Val Pro Thr Pro Arg Leu 
485 490 495 

Pro Ala His Leu Tyr Lys Arg Val Ala Val Asp Met 
500 505 

Fig. 4G 
SEQ ID 3 

<400> 3 

tgcaaacctg cacaaacaaa gagagagang aagaaaaagg aagagaggag agagagagag 60 

agagagayaa gcc atg gat tct tct ctt cat gaa gcc ttg caa cca cta 109 
Met Asp Ser Ser Leu His Glu Ala Leu Gln Pro Leu 
1 5 10 

ccc atg acg cty ttc ttc att ata cct ttg cta ctc tta ttg ggc cta 157 
Pro Met Thr Leu Phe Phe Ile Ile Pro Leu Leu Leu Leu Leu Gly Leu 
15 20 25 

gta tct cgg ctt cgc cag aga cta cca tac cca cca ggc cca aaa ggc 205 
Val Ser Arg Leu Arg Gln Arg Leu Pro Tyr Pro Pro Gly Pro Lys Gly 
30 35 40 

tta ccg gtg atc gga aac atg ctc atg atg gat caa ctc act cac cga 253 
Leu Pro Val Ile Gly Asn Met Leu Met Met Asp Gln Leu Thr His Arg 
45 50 55 60 

gga ctc gcc aaa ctc gcc aaa caa tac ggc ggt cta ttc cac ctc aag 301 
Gly Leu Ala Lys Leu Ala Lys Gln Tyr Gly Gly Leu Phe His Leu Lys 
65 70 75 

atg gga ttc tta cac atg gtg gcc gtt tcc aca ccc gac atg gct cgc 349 
Met Gly Phe Leu His Met Val Ala Val Ser Thr Pro Asp Met Ala Arg 
80 85 90 

caa gtc ctt caa gtc caa gac aac atc ttc tcg aac cgg cca gcc acc 397 
Gln Val Leu Gln Val Gln Asp Asn Ile Phe Ser Asn Arg Pro Ala Thr 
95 100 105 

ata gcc atc agc tac ctc acc tat gac cga gcc gac atg gcc ttc gct 445 
Ile Ala Ile Ser Tyr Leu Thr Tyr Asp Arg Ala Asp Met Ala Phe Ala 
110 115 120 

cac tac ggc ccg ttt tgg cgt cag atg cgt aaa ctc tgc gtc atg aaa 493 
His Tyr Gly Pro Phe Trp Arg Gln Met Arg Lys Leu Cys Val Met Lys 
125 130 135 140 

tta ttt agc cgg aaa cga gcc gag tcg tgg gag tcg gtc cga gac gag 541 
Leu Phe Ser Arg Lys Arg Ala Glu Ser Trp Glu Ser Val Arg Asp Glu 
145 150 155 

gtc gac tcg gca gta cga gtg gtc gcg tcc aat att ggg tcg acg gtg 589 
Val Asp Ser Ala Val Arg Val Val Ala Ser Asn Ile Gly Ser Thr Val 
160 165 170 

Fig. 5A 
SEQ ID 3 

aat atc ggc gag ctg gtt ttt gct ctg acg aag aat att act tac agg 637 
Asn Ile Gly Glu Leu Val Phe Ala Leu Thr Lys Asn Ile Thr Tyr Arg 
175 180 185 

gcg gct ttt ggg acg atc tcg cat gag gac cag gac gag ttc gtg gcc 685 
Ala Ala Phe Gly Thr Ile Ser His Glu Asp Gln Asp Glu Phe Val Ala 
190 195 200 

ata ctg caa gag ttt tcg cag ctg ttt ggt gct ttt aat ata gct gat 733 
Ile Leu Gln Glu Phe Ser Gln Leu Phe Gly Ala Phe Asn Ile Ala Asp 
205 210 215 220 

ttt atc cct tgg ctc aaa tgg gtt cct cag ggg att aac gtc agg ctc 781 
Phe Ile Pro Trp Leu Lys Trp Val Pro Gln Gly Ile Asn Val Arg Leu 
225 230 235 

aac aag gca cga ggg gcg ctt gat ggg ttt att gac aag atc atc gac 829 
Asn Lys Ala Arg Gly Ala Leu Asp Gly Phe Ile Asp Lys Ile Ile Asp 
240 245 250 

gat cat ata cag aag ggg agt aaa aac tcg gag gag gtt gat act gat 877 
Asp His Ile Gln Lys Gly Ser Lys Asn Ser Glu Glu Val Asp Thr Asp 
255 260 265 

atg gta gat gat tta ctt gct ttt tac ggt gag gaa gcc aaa gta agc 925 
Met Val Asp Asp Leu Leu Ala Phe Tyr Gly Glu Glu Ala Lys Val Ser 
270 275 280 

gaa tct gac gat ctt caa aat tcc atc aaa ctc acc aaa gac aac atc 973 
Glu Ser Asp Asp Leu Gln Asn Ser Ile Lys Leu Thr Lys Asp Asn Ile 
285 290 295 300 

aaa gct atc atg gac gta atg ttt gga ggg acc gaa acg gtg gcg tcc 1021 
Lys Ala Ile Met Asp Val Met Phe Gly Gly Thr Glu Thr Val Ala Ser 
305 310 315 

gcg att gaa tgg gcc atg acg gag ctg atg aaa agc cca gaa gat cta 1069 
Ala Ile Glu Trp Ala Met Thr Glu Leu Met Lys Ser Pro Glu Asp Leu 
320 325 330 

aag aag gtc caa caa gaa ctc gcc gtg gtg gtg ggt ctt gac cgg cga 1117 
Lys Lys Val Gln Gln Glu Leu Ala Val Val Val Gly Leu Asp Arg Arg 
335 340 345 

Fig. 5B 
SEQ ID 3 

gtc gaa gag aaa gac ttc gag aag ctc acc tac ttg aaa tgc gta cty 1165 
Val Glu Glu Lys Asp Phe Glu Lys Leu Thr Tyr Leu Lys Cys Val Leu 
350 355 360 

aag gaa gtc ctt cgc ctc cac cca ccc atc cca ctc ctc ctc cac gag 1213 
Lys Glu Val Leu Arg Leu His Pro Pro Ile Pro Leu Leu Leu His Glu 
365 370 375 380 

act gcc gag gac gcc gag gtc ggc ggc tac tac att ccg gcg aaa tcg 1261 
Thr Ala Glu Asp Ala Glu Val Gly Gly Tyr Tyr Ile Pro Ala Lys Ser 
385 390 395 

cgg gtg atg atc aac gcg tgc gcc atc ggc cgg gac aag aac tcg tgg 1309 
Arg Val Met Ile Asn Ala Cys Ala Ile Gly Arg Asp Lys Asn Ser Trp 
400 405 410 

gcc gac cca gat acg ttt agg ccc tcc agg ttt ctc aaa gac ggt gtg 1357 
Ala Asp Pro Asp Thr Phe Arg Pro Ser Arg Phe Leu Lys Asp Gly Val 
415 420 425 

ccc gat ttc aaa ggg aac aac ttc gag ttc atc cca ttc ggg tca ggt 1405 
Pro Asp Phe Lys Gly Asn Asn Phe Glu Phe Ile Pro Phe Gly Ser Gly 
430 435 440 

cgt cgg tct tgc ccc ggt atg caa ctc gga ctc tac gcg cta gag acg 1453 
Arg Arg Ser Cys Pro Gly Met Gln Leu Gly Leu Tyr Ala Leu Glu Thr 
445 450 455 460 

act gtg gct cac ctc ctt cac tgt ttc acg tgg gag ttg ccg gac ggg 1501 
Thr Val Ala His Leu Leu His Cys Phe Thr Trp Glu Leu Pro Asp Gly 
465 470 475 

atg aaa ccg agt gaa ctc gag atg aat gat gtg ttt gga ctc acc gcg 1549 
Met Lys Pro Ser Glu Leu Glu Met Asn Asp Val Phe Gly Leu Thr Ala 
480 485 490 

cca aga gcg att cga ctc acc gcc gtg ccg agt cca cgc ctt ctc tgt 1597 
Pro Arg Ala Ile Arg Leu Thr Ala Val Pro Ser Pro Arg Leu Leu Cys 
495 500 505 

cct ctc tat tgatcgaatg attgggggag ctttgtggag gggcttttat 1646 
Pro Leu Tyr 
510 

Fig. 5C 
ggng.ictcta tatatcigatg ggaaglgaaa 
atatattggg gagggagggg aaaaaaaaaa 
atttclcttc cticlgLggal: aaaagcctcg 
UjUttgCtUa L L L t tia tctc LLLUtLLgca 



SEQ ID 3 

caacgacagg Ugaa tgct: tg ga t tL Ltiggt 1.70G 
laalgaaagg aaa< r i;»aaga gagaatttga 17GG 
LLlLLaaUtg 1: L 1 1 U Lg tg gagaliat U tg 1G2G 
aLaacacLca aaaaLnaaaa aaaaaaa 1003 



Fig. 5D 
SEQ ID 4 

<400> 4 

Met Asp Ser Ser Leu His Glu Ala Leu Gln Pro Leu Pro Met Thr Leu 
1 5 10 15 

Phe Phe Ile Ile Pro Leu Leu Leu Leu Leu Gly Leu Val Ser Arg Leu 
20 25 30 

Arg Gln Arg Leu Pro Tyr Pro Pro Gly Pro Lys Gly Leu Pro Val Ile 
35 40 45 

Gly Asn Met Leu Met Met Asp Gln Leu Thr His Arg Gly Leu Ala Lys 
50 55 60 

Leu Ala Lys Gln Tyr Gly Gly Leu Phe His Leu Lys Met Gly Phe Leu 
65 70 75 80 

His Met Val Ala Val Ser Thr Pro Asp Met Ala Arg Gln Val Leu Gln 
85 90 95 

Val Gln Asp Asn Ile Phe Ser Asn Arg Pro Ala Thr Ile Ala Ile Ser 
100 105 110 

Tyr Leu Thr Tyr Asp Arg Ala Asp Met Ala Phe Ala His Tyr Gly Pro 
115 120 125 

Phe Trp Arg Gln Met Arg Lys Leu Cys Val Met Lys Leu Phe Ser Arg 
130 135 140 

Lys Arg Ala Glu Ser Trp Glu Ser Val Arg Asp Glu Val Asp Ser Ala 
145 150 155 160 

Val Arg Val Val Ala Ser Asn Ile Gly Ser Thr Val Asn Ile Gly Glu 
165 170 175 

Leu Val Phe Ala Leu Thr Lys Asn Ile Thr Tyr Arg Ala Ala Phe Gly 
180 185 190 

Thr Ile Ser His Glu Asp Gln Asp Glu Phe Val Ala Ile Leu Gln Glu 
195 200 205 

Phe Ser Gln Leu Phe Gly Ala Phe Asn Ile Ala Asp Phe Ile Pro Trp 
210 215 220 

Leu Lys Trp Val Pro Gln Gly Ile Asn Val Arg Leu Asn Lys Ala Arg 
225 230 235 240 

Fig. 5E 
SEQ ID 4 

Gly Ala Leu Asp Gly Phe Ile Asp Lys Ile Ile Asp Asp His Ile Gln 
245 250 255 

Lys Gly Ser Lys Asn Ser Glu Glu Val Asp Thr Asp Met Val Asp Asp 
260 265 270 

Leu Leu Ala Phe Tyr Gly Glu Glu Ala Lys Val Ser Glu Ser Asp Asp 
275 280 285 

Leu Gln Asn Ser Ile Lys Leu Thr Lys Asp Asn Ile Lys Ala Ile Met 
290 295 300 

Asp Val Met Phe Gly Gly Thr Glu Thr Val Ala Ser Ala Ile Glu Trp 
305 310 315 320 

Ala Met Thr Glu Leu Met Lys Ser Pro Glu Asp Leu Lys Lys Val Gln 
325 330 335 

Gln Glu Leu Ala Val Val Val Gly Leu Asp Arg Arg Val Glu Glu Lys 
340 345 350 

Asp Phe Glu Lys Leu Thr Tyr Leu Lys Cys Val Leu Lys Glu Val Leu 
355 360 365 

Arg Leu His Pro Pro Ile Pro Leu Leu Leu His Glu Thr Ala Glu Asp 
370 375 380 

Ala Glu Val Gly Gly Tyr Tyr Ile Pro Ala Lys Ser Arg Val Met Ile 
385 390 395 400 

Asn Ala Cys Ala Ile Gly Arg Asp Lys Asn Ser Trp Ala Asp Pro Asp 
405 410 415 

Thr Phe Arg Pro Ser Arg Phe Leu Lys Asp Gly Val Pro Asp Phe Lys 
420 425 430 

Gly Asn Asn Phe Glu Phe Ile Pro Phe Gly Ser Gly Arg Arg Ser Cys 
435 440 445 

Pro Gly Met Gln Leu Gly Leu Tyr Ala Leu Glu Thr Thr Val Ala His 
450 455 460 

Leu Leu His Cys Phe Thr Trp Glu Leu Pro Asp Gly Met Lys Pro Ser 
465 470 475 480 
SEQ ID 4 

Glu Leu Glu Met Asn Asp Val Phe Gly Leu Thr Ala Pro Arg Ala Ile 
485 490 495 

Arg Leu Thr Ala Val Pro Ser Pro Arg Leu Leu Cys Pro Leu Tyr 
500 505 510 

Fig. 5G 
<400> 10 

aaacaccaat LtaaLgggat: Ltzcagalt q 

ttattgtaat c(. aaccaatt claaLlloea 

ccgaaaacag cgaaLgaaat gtctgggtga 

cgggtgLLgg cctagccggg aLgggggLag 
aaLggagttL Ucggggtagg tagtaacgLa 
caaaaatcca accgctcclit cacatcgcag 
caclcaatcg atcgcctgcc gtggtlgccc 
accaacaatt ccaggccggc tit tctialaca 
agccggcctc tgcttcct tc tcagtagccc 
Ucat ttgtc agacacgLtt tccgccaliLL 
gltcygat lg ggattgaatc aattgaaagg 



SEQ ID 10 

Utcccalgc LaLtggctaa ggcaUULULc 60 

ccctggtgtg aactgactga caaatgcggL 120 

tzcggtcaaac aagcggtggg cgagagagcg 180 

gtagacggcg tiatlaccggc gagttgtccg 240 

gacgtcaatg gaaaaagtca taatcticcg t 300 

agttggf.ggc cacgggaccc tccacccacl 360 

attattcaac catacgccac titgaclicttc 420 

aLgLacLgca caggaaaatc caatataaaa 480 

ccagctcaCt caattcttcc cactgcaggc 540 

ttcgccLgtt tctqcggaga atttgatcag 600 

t I tttatULt cagtatUtcg atcgccatg 659 



Fig. 6 
SEQ ID 11 

<400> 11 

ggccgggtgg tgacalttat tcataaot tc atctcanaac aagaacjgatt ta< :aaaaa La CO 
aaagaaaacn aaottttcat; c'ltaocaLa attatan L tg tgttcacaaa attcaaactt 120 
aaacccLUa tataaagaat LLctttcaac aaLacacliLL aatcacnacl: Let tea a tea 100 
caacctcctc caacaaaatt aaaatagatt aataaataaa Laaacttaac tatttiaaaaa 240 
aaaatattat acaaaattta Ltaaaacttc aaaataaaca oactltttat acaaaattca 300 
tcaaaacttt aaaataaacjc Laaacactga aaatgtgagl acatltaaaa ggaegctgat 360 
cacaaaaatt ttgaaaacat aaacaaactt gaaaclctoc cttttaagaa tgagtttgtc 420 
gtctcattaa ctcattagtt ttatagttcg aatccaatta aegtatcttt tattttatgg 400 
aataagggtg LtltaaLaag tgattttggg atttttLLag taaLLLattt gtgatatgtt 540 
atggagtttt taaaaatata tatalatata tatatttttg ggttgagttt acttaaaatl 600 
tggaaaaggt Lggtaagaac tataaattga gttgtgaatg agtgttttat ggatttttta 660 
agatgttana tttatatatg taattaaaat tttattttga ataacaaaaa ttataattgg 720 
ataaaaaatt gttttgttaa atttagagLa aaaatttcaa aatctaaaat aattaaacac 700 
tattattttt aaaaaatttg ttggtaaatt Ltatcttata tttaagttaa aatttagaaa 040 
aaattaattt taaattaata aacttttgaa gtcaaatatt ccaaatattt tccaaaatat 900 
taaatctatt ttgcattcaa aatacaattt aaataataaa acttcatgga atagatta,?; 960 
caatttgtat aaaaaccaaa aatctcaaat aaaatttaaa ttacaaaaca ttatcaacat 1020 
tatgatttca agaaagacaa taaccagUt ccaataaaat aaaaaacctc atggcccgta 1000 
attaagatct cattaattaa ttcttatttt ttaatttttt taca taqaaa atatctttat 1140 
attgtatcca agaaatatag aatgttctcg tccagggact attaatctcc aaacaagttt 1200 
caaaatcatt acattaaagc tcatcatgtc atttgtggal tggaaattat attgtataag 1260 
agaaatatag aatgttctcg tctagggact attaatttcc aaacaaattt caaaat-.:att 1320 



Fig. 7A 
SEQ ID 11 



acattaaacjc lea Lcalglc alltgtggal LggaaatLag acaaaaaaaa Lcccaaatal 1300 

LtcLcLcaal ctcccaaanL ataglLcgaa cLcca La L t L tLggaaaLtg agaaLttLLt 1140 

Lacccaalaa LatatLlttl taLacatltL agagaLLLLc cagaca La L t tgcLctggga 1500 

LLLatLggaa LgaaggLtga gtLaLaaacL LtcagLaaLc caagLaLcLL cggLttLLga 1 Li GO 

agatacLaaa LccaLlalat aaLaaaaaca caLLtLaaac accaaLLLaa Lgggatttca 1620 

gaLttgLaLc ccatgcLaLt ggctaaggca LtttLclLaL tgtaatctaa ccaattcLaa 1680 

LtLccacccL ggtgLgaacL gactgacaaa Lgcggtccga aaacagcgaa tgaaatgtct 1710 

gggLgaLcgg tcaaicaagc ggtgggcgag agagegeggg LgLtggccLa geegggatgg 1000 

gggLaggtag acggcgLalL accggcgagL LgLccgaaLg gagLLLLcgg ggtaggLagt 1060 

aacgtagacg LcaaLggaaa aagLcataaL ctccgLcaaa aatccaaccg ctccttcaca 1920 

tcgcagagLL ggLggccacg ggacccLcca cccactcacL cgaLcgccLg ccgLggttgc 1900 

ccattattca accatacgcc acttgactcL tcaccaacaa LLccaggccg gctttctata 2040 

caatgLactg cacaggaaaa tccaatataa aaagccggcc tcLgcLtcct tcLcagLagc 2100 

ccccagctca LLcaattctt cccactgcag gctacattLg tcagacacgL LttccgccaL 2160 

t L t Lcgcctg tLLctgcgga gaaLLLgatc aqgLtcggat Lggga L tgaa LcaaLLgaaa 2220 

ggtttLtatt ttcagtaLLL cgatcgccat g 2251 
SEQ ID 9 

<400> 9 

aaagataala laLgtigtaty cclactacU cacat tgt lit ligaagtg tgC aaucataglg GO 
caacaclagg aggacLcaca aLgagcactL gltgacaliga aacLagctaa algcccaaca 120 
atai aglga aagctiagl La aactiaacccc U U t gacl: U Lc aann Lga la I a t t la La tec ]00 
ctactacglc ILccLcttll tglcltlctc Itglgatlaa acctlccltg aaacaattct 240 
caaalgtaaa attaaacclt gaaacLUgta gagaccaaac llccclagga gaaaccacat 300 
ttatqacaac alatatacac caacccaLLg calactataa lattggaall acctgcagcg 360 
aacgaaagaa acgclgtclc accaaclcgl gcactacatc ccgaaactta accttcccct 420 
gatacagatt gaagagcega aaaaagcgLg catccaaatt tclgglalgg tgaggagccg 400 
aaaaacgcgl gcgccLaalt ttltlgagat gggccggaaa alaalgcglg catctaaatt 510 
ttcacgtgtc gcglaltggc gaggttgege Lgaalgtgat cctgtgcgtg agccacallc 600 
altccaltgg tlgacccgcc ggtaccgega ggaccgtggg glclcacaga tacgcggaLg 660 
gtggalcagc actgagaaga tlagalgalg accaggeggg calltgaagt aaaaacttgg 720 
gggtggttigg caagtiacgcg acaaagaggg gtagtgcgca aggaagegag ttggatgcaa 700 
alaatattac aaaglgggtt ggtgggcatg agcatcaacc agaatgatgt tgttgctggl 040 
Iccgtgcaaa ttctgaccag taglttgaac aatactaccc aacttgtltl tgglaaaaca 900 
igaagtgggt aaggagaatt gaacltacgt ctcalggtaa agggcaaggg caaatgacU 960 
aacacatacc tttaactaat aaaaataccc ctaacaaata cgaaaacgaa tgagttatca 1020 
cagacctlca actaataaga tagccalcag acccacatct cctgactigac caaaaacaaa 1000 
tgacttcaac caactaagat acccatcaaa gctaacccac aacccaattc ctcacttccc 1140 
cttaccagac caaccaagca gacctacgcc attaactact ttaggacglg ggaatiLgggg 1200 
gtgccaccgt tgaagaatgg cactcagggt Iggtaatccc tccacglgla tgtagcagtc 1260 
gtttggtgga gacggcgtgt ttgaalgtcc accllccagt tlggagaaca aggaaaltgg 1320 
gcttataULa gcjcclggatc tctligttiLca 

attcaagaat Lcaa t Lgccc tgccctgcUc 

gctctggtUU gLlcaatttic ttigacccctg 

cgaUUatata agtcat: 1 1 t:g gatccLtgca 



SEQ ID 9 

gagcaggagL aglticaggac aggaaclagc 1380 

tgctclgcUU LgcLcaacLU at tgatcccU 1140 

ctgggttctg cLcLggl ttg cacactUtct 1500 

aggaagagaa UaUg 1541 



Fig. 8B 
SEQUENCE LISTING 

<110> Chiang, Vincent L 
Carraway, Daniel T 
Smeltzer, Richard H 

<120> Production of Syringyl Lignin in Gymnosperms 

<130> 50617 

<140> US 08/991,677 
<141> 1997-12-16 

<150> US 60/033,381 
<151> 1996-12-16 

<160> 11 

<170> PatentIn Ver. 2.0 

<210> 1 

<211> 1708 

<212> DNA 

<213> Liquidambar styraciflua 

<220> 

<221> CDS 

<222> (48)..(1571) 

<400> 1 

cggcacgagg aaaccctaaa actcacctct cttacccttt ctcttca atg gct ttc 56 
Met Ala Phe 
1 

ctt cta ata ccc atc tca ata atc ttc atc gtc tta gct tac cag ctc 104 
Leu Leu Ile Pro Ile Ser Ile Ile Phe Ile Val Leu Ala Tyr Gln Leu 
5 10 15 

tat caa cgg ctc aga ttt aag ctc cca ccc ggc cca cgt cca tgg ccg 152 
Tyr Gln Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro Arg Pro Trp Pro 
20 25 30 35 

atc gtc gga aac ctt tac gac ata aaa ccg gtg agg ttc cgg tgt ttc 200 
Ile Val Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg Phe Arg Cys Phe 
40 45 50 

gcc gag tgg tca caa gcg tac ggt ccg atc ata tcg gtg tgg ttc ggt 248 
Ala Glu Trp Ser Gln Ala Tyr Gly Pro Ile Ile Ser Val Trp Phe Gly 
55 60 65 

tca acg ttg aat gtg atc gta tcg aat tcg gaa ttg gct aag gaa gtg 296 
Ser Thr Leu Asn Val Ile Val Ser Asn Ser Glu Leu Ala Lys Glu Val 
70 75 80 

ctc aag gaa aaa gat caa caa ttg gct gat agg cat agg agt aga tca 344 
Leu Lys Glu Lys Asp Gln Gln Leu Ala Asp Arg His Arg Ser Arg Ser 
85 90 95 

gct gcc aaa ttt agc agg gat ggg cag gac ctt ata tgg gct gat tat 392 
Ala Ala Lys Phe Ser Arg Asp Gly Gln Asp Leu Ile Trp Ala Asp Tyr 
100 105 110 115 

gga cct cac tat gtg aag gtt aca aag gtt tgt acc ctc gag ctt ttt 440 
Gly Pro His Tyr Val Lys Val Thr Lys Val Cys Thr Leu Glu Leu Phe 
120 125 130 

act cca aag cgg ctt gaa gct ctt aga ccc att aga gaa gat gaa gtt 488 
Thr Pro Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg Glu Asp Glu Val 
135 140 145 

aca gcc atg gtt gag tcc att ttt aat gac act gcg aat cct gaa aat 536 
Thr Ala Met Val Glu Ser Ile Phe Asn Asp Thr Ala Asn Pro Glu Asn 
150 155 160 

tat ggg aag agt atg ctg gtg aag aag tat ttg gga gca gta gca ttc 584 
Tyr Gly Lys Ser Met Leu Val Lys Lys Tyr Leu Gly Ala Val Ala Phe 
165 170 175 

aac aac att aca aga ctc gca ttt gga aag cga ttc gtg aat tca gag 632 
Asn Asn Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe Val Asn Ser Glu 
180 185 190 195 

ggt gta atg gac gag caa gga ctt gaa ttt aag gaa att gtg gcc aat 680 
Gly Val Met Asp Glu Gln Gly Leu Glu Phe Lys Glu Ile Val Ala Asn 
200 205 210 

gga ctc aag ctt ggt gcc tca ctt gca atg gct gag cac att cct tgg 728 
Gly Leu Lys Leu Gly Ala Ser Leu Ala Met Ala Glu His Ile Pro Trp 
215 220 225 

ctc cgt tgg atg ttc cca ctt gag gaa ggg gcc ttt gcc aag cat ggg 776 
Leu Arg Trp Met Phe Pro Leu Glu Glu Gly Ala Phe Ala Lys His Gly 
230 235 240 

gca cgt agg gac cga ctt acc aga gct atc atg gaa gag cac aca ata 824 
Ala Arg Arg Asp Arg Leu Thr Arg Ala Ile Met Glu Glu His Thr Ile 
245 250 255 

gcc cgt aaa aag agt ggt gga gcc caa caa cat ttc gtg gat gca ttg 872 
Ala Arg Lys Lys Ser Gly Gly Ala Gln Gln His Phe Val Asp Ala Leu 
260 265 270 275 

ctc acc cta caa gag aaa tat gac ctt agc gag gac act att att ggg 920 
Leu Thr Leu Gln Glu Lys Tyr Asp Leu Ser Glu Asp Thr Ile Ile Gly 
280 285 290 

ctc ctt tgg gat atg atc act gca ggc atg gac aca acc gca atc tct 968 
Leu Leu Trp Asp Met Ile Thr Ala Gly Met Asp Thr Thr Ala Ile Ser 
295 300 305 

gtc gaa tgg gcc atg gcc gag tta att aag aac cca agg gtg caa caa 1016 
Val Glu Trp Ala Met Ala Glu Leu Ile Lys Asn Pro Arg Val Gln Gln 
310 315 320 

aaa gct caa gag gag cta gac aat gta ctt ggg tcc gaa cgt gtc ctg 1064 
Lys Ala Gln Glu Glu Leu Asp Asn Val Leu Gly Ser Glu Arg Val Leu 
325 330 335 

acc gaa ttg gac ttc tca agc ctc cct tat cta caa tgt gta gcc aag 1112 
Thr Glu Leu Asp Phe Ser Ser Leu Pro Tyr Leu Gln Cys Val Ala Lys 
340 345 350 355 

gag gca cta agg ctg cac cct cca aca cca cta atg ctc cct cat cgc 1160 
Glu Ala Leu Arg Leu His Pro Pro Thr Pro Leu Met Leu Pro His Arg 
360 365 370 

gcc aat gcc aac gtc aaa att ggt ggc tac gac atc cct aag gga tca 1208 
Ala Asn Ala Asn Val Lys Ile Gly Gly Tyr Asp Ile Pro Lys Gly Ser 
375 380 385 

aat gtt cat gta aat gtc tgg gcc gtg gct cgt gat cca gca gtg tgg 1256 
Asn Val His Val Asn Val Trp Ala Val Ala Arg Asp Pro Ala Val Trp 
390 395 400 

cgt gac cca cta gag ttt cga ccg gaa cgg ttc tct gaa gac gat gtc 1304 
Arg Asp Pro Leu Glu Phe Arg Pro Glu Arg Phe Ser Glu Asp Asp Val 
405 410 415 

gac atg aaa ggt cac gat tat agg cta ctg ccg ttt ggt gca ggg agg 1352 
Asp Met Lys Gly His Asp Tyr Arg Leu Leu Pro Phe Gly Ala Gly Arg 
420 425 430 435 

cgt gtt tgc ccc ggt gca caa ctt ggc atc aat ttg gtc aca tcc atg 1400 
Arg Val Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu Val Thr Ser Met 
440 445 450 

atg ggt cac cta ttg cac cat ttc tat tgg agc cct cct aaa ggt gta 1448 
Met Gly His Leu Leu His His Phe Tyr Trp Ser Pro Pro Lys Gly Val 
455 460 465 

aaa cca gag gag att gac atg tca gag aat cca gga ttg gtc acc tac 1496 
Lys Pro Glu Glu Ile Asp Met Ser Glu Asn Pro Gly Leu Val Thr Tyr 
470 475 480 

atg cga acc ccg gtg caa gct gtt ccc act cca agg ctg cct gct cac 1544 
Met Arg Thr Pro Val Gln Ala Val Pro Thr Pro Arg Leu Pro Ala His 
485 490 495 

ttg tac aaa cgt gta gct gtg gat atg taattcttag tttgttatta 1591 
Leu Tyr Lys Arg Val Ala Val Asp Met 
500 505 

ttcatgctct taaggttttg gactttgaac ttatgatgag atttgtaaaa ttccaagtga 1651 
tcaaatgaag aaaagaccaa ataaaaaggc ttgacgattt aaaaaaaaaa aaaaaaa 1708 



<210> 2 
<211> 508 
<212> PRT 

<213> Liquidambar styraciflua 
<400> 2 

Met Ala Phe Leu Leu Ile Pro Ile Ser Ile Ile Phe Ile Val Leu Ala 
1 5 10 15 

Tyr Gln Leu Tyr Gln Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro Arg 
20 25 30 

Pro Trp Pro Ile Val Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg Phe 
35 40 45 

Arg Cys Phe Ala Glu Trp Ser Gln Ala Tyr Gly Pro Ile Ile Ser Val 
50 55 60 

Trp Phe Gly Ser Thr Leu Asn Val Ile Val Ser Asn Ser Glu Leu Ala 
65 70 75 80 

Lys Glu Val Leu Lys Glu Lys Asp Gln Gln Leu Ala Asp Arg His Arg 
85 90 95 

Ser Arg Ser Ala Ala Lys Phe Ser Arg Asp Gly Gln Asp Leu Ile Trp 
100 105 110 

Ala Asp Tyr Gly Pro His Tyr Val Lys Val Thr Lys Val Cys Thr Leu 
115 120 125 

Glu Leu Phe Thr Pro Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg Glu 
130 135 140 

Asp Glu Val Thr Ala Met Val Glu Ser Ile Phe Asn Asp Thr Ala Asn 
145 150 155 160 

Pro Glu Asn Tyr Gly Lys Ser Met Leu Val Lys Lys Tyr Leu Gly Ala 
165 170 175 

Val Ala Phe Asn Asn Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe Val 
180 185 190 

Asn Ser Glu Gly Val Met Asp Glu Gln Gly Leu Glu Phe Lys Glu Ile 
195 200 205 

Val Ala Asn Gly Leu Lys Leu Gly Ala Ser Leu Ala Met Ala Glu His 
210 215 220 

Ile Pro Trp Leu Arg Trp Met Phe Pro Leu Glu Glu Gly Ala Phe Ala 
225 230 235 240 

Lys His Gly Ala Arg Arg Asp Arg Leu Thr Arg Ala Ile Met Glu Glu 
245 250 255 

His Thr Ile Ala Arg Lys Lys Ser Gly Gly Ala Gln Gln His Phe Val 
260 265 270 

Asp Ala Leu Leu Thr Leu Gln Glu Lys Tyr Asp Leu Ser Glu Asp Thr 
275 280 285 

Ile Ile Gly Leu Leu Trp Asp Met Ile Thr Ala Gly Met Asp Thr Thr 
290 295 300 

Ala Ile Ser Val Glu Trp Ala Met Ala Glu Leu Ile Lys Asn Pro Arg 
305 310 315 320 

Val Gln Gln Lys Ala Gln Glu Glu Leu Asp Asn Val Leu Gly Ser Glu 
325 330 335 

Arg Val Leu Thr Glu Leu Asp Phe Ser Ser Leu Pro Tyr Leu Gln Cys 
340 345 350 

Val Ala Lys Glu Ala Leu Arg Leu His Pro Pro Thr Pro Leu Met Leu 
355 360 365 

Pro His Arg Ala Asn Ala Asn Val Lys Ile Gly Gly Tyr Asp Ile Pro 
370 375 380 

Lys Gly Ser Asn Val His Val Asn Val Trp Ala Val Ala Arg Asp Pro 
385 390 395 400 

Ala Val Trp Arg Asp Pro Leu Glu Phe Arg Pro Glu Arg Phe Ser Glu 
405 410 415 

Asp Asp Val Asp Met Lys Gly His Asp Tyr Arg Leu Leu Pro Phe Gly 
420 425 430 

Ala Gly Arg Arg Val Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu Val 
435 440 445 

Thr Ser Met Met Gly His Leu Leu His His Phe Tyr Trp Ser Pro Pro 
450 455 460 

Lys Gly Val Lys Pro Glu Glu Ile Asp Met Ser Glu Asn Pro Gly Leu 
465 470 475 480 

Val Thr Tyr Met Arg Thr Pro Val Gln Ala Val Pro Thr Pro Arg Leu 
485 490 495 

Pro Ala His Leu Tyr Lys Arg Val Ala Val Asp Met 
500 505 


<210> 3 
<211> 1883 
<212> DNA 

<213> Liquidambar styraciflua 

<220> 

<221> CDS 

<222> (74)..(1606) 

<400> 3 

tgcaaacctg cacaaacaaa gagagagaag aagaaaaagg aagagaggag agagagagag 60 

agagagagaa gcc atg gat tct tct ctt cat gaa gcc ttg caa cca cta 109 
Met Asp Ser Ser Leu His Glu Ala Leu Gln Pro Leu 
1 5 10 

ccc atg acg ctg ttc ttc att ata cct ttg cta ctc tta ttg ggc cta 157 
Pro Met Thr Leu Phe Phe Ile Ile Pro Leu Leu Leu Leu Leu Gly Leu 
15 20 25 

gta tct cgg ctt cgc cag aga cta cca tac cca cca ggc cca aaa ggc 205 
Val Ser Arg Leu Arg Gln Arg Leu Pro Tyr Pro Pro Gly Pro Lys Gly 
30 35 40 

tta ccg gtg atc gga aac atg ctc atg atg gat caa ctc act cac cga 253 
Leu Pro Val Ile Gly Asn Met Leu Met Met Asp Gln Leu Thr His Arg 
45 50 55 60 

gga ctc gcc aaa ctc gcc aaa caa tac ggc ggt cta ttc cac ctc aag 301 
Gly Leu Ala Lys Leu Ala Lys Gln Tyr Gly Gly Leu Phe His Leu Lys 
65 70 75 

atg gga ttc tta cac atg gtg gcc gtt tcc aca ccc gac atg gct cgc 349 
Met Gly Phe Leu His Met Val Ala Val Ser Thr Pro Asp Met Ala Arg 
80 85 90 

caa gtc ctt caa gtc caa gac aac atc ttc tcg aac cgg cca gcc acc 397 
Gln Val Leu Gln Val Gln Asp Asn Ile Phe Ser Asn Arg Pro Ala Thr 
95 100 105 

ata gcc atc agc tac ctc acc tat gac cga gcc gac atg gcc ttc gct 445 
Ile Ala Ile Ser Tyr Leu Thr Tyr Asp Arg Ala Asp Met Ala Phe Ala 
110 115 120 

cac tac ggc ccg ttt tgg cgt cag atg cgt aaa ctc tgc gtc atg aaa 493 
His Tyr Gly Pro Phe Trp Arg Gln Met Arg Lys Leu Cys Val Met Lys 
125 130 135 140 

tta ttt agc cgg aaa cga gcc gag tcg tgg gag tcg gtc cga gac gag 541 
Leu Phe Ser Arg Lys Arg Ala Glu Ser Trp Glu Ser Val Arg Asp Glu 
145 150 155 

gtc gac tcg gca gta cga gtg gtc gcg tcc aat att ggg tcg acg gtg 589 
Val Asp Ser Ala Val Arg Val Val Ala Ser Asn Ile Gly Ser Thr Val 
160 165 170 

aat atc ggc gag ctg gtt ttt gct ctg acg aag aat att act tac agg 637 
Asn Ile Gly Glu Leu Val Phe Ala Leu Thr Lys Asn Ile Thr Tyr Arg 
175 180 185 

gcg gct ttt ggg acg atc tcg cat gag gac cag gac gag ttc gtg gcc 685 
Ala Ala Phe Gly Thr Ile Ser His Glu Asp Gln Asp Glu Phe Val Ala 
190 195 200 

ata ctg caa gag ttt tcg cag ctg ttt ggt gct ttt aat ata gct gat 733 
Ile Leu Gln Glu Phe Ser Gln Leu Phe Gly Ala Phe Asn Ile Ala Asp 
205 210 215 220 

ttt atc cct tgg ctc aaa tgg gtt cct cag ggg att aac gtc agg ctc 781 
Phe Ile Pro Trp Leu Lys Trp Val Pro Gln Gly Ile Asn Val Arg Leu 
225 230 235 

aac aag gca cga ggg gcg ctt gat ggg ttt att gac aag atc atc gac 829 
Asn Lys Ala Arg Gly Ala Leu Asp Gly Phe Ile Asp Lys Ile Ile Asp 
240 245 250 

gat cat ata cag aag ggg agt aaa aac tcg gag gag gtt gat act gat 877 
Asp His Ile Gln Lys Gly Ser Lys Asn Ser Glu Glu Val Asp Thr Asp 
255 260 265 

atg gta gat gat tta ctt gct ttt tac ggt gag gaa gcc aaa gta agc 925 
Met Val Asp Asp Leu Leu Ala Phe Tyr Gly Glu Glu Ala Lys Val Ser 
270 275 280 

gaa tct gac gat ctt caa aat tcc atc aaa ctc acc aaa gac aac atc 973 
Glu Ser Asp Asp Leu Gln Asn Ser Ile Lys Leu Thr Lys Asp Asn Ile 
285 290 295 300 

aaa gct atc atg gac gta atg ttt gga ggg acc gaa acg gtg gcg tcc 1021 
Lys Ala Ile Met Asp Val Met Phe Gly Gly Thr Glu Thr Val Ala Ser 
305 310 315 

gcg att gaa tgg gcc atg acg gag ctg atg aaa agc cca gaa gat cta 1069 
Ala Ile Glu Trp Ala Met Thr Glu Leu Met Lys Ser Pro Glu Asp Leu 
320 325 330 

aag aag gtc caa caa gaa ctc gcc gtg gtg gtg ggt ctt gac cgg cga 1117 
Lys Lys Val Gln Gln Glu Leu Ala Val Val Val Gly Leu Asp Arg Arg 
335 340 345 

gtc gaa gag aaa gac ttc gag aag ctc acc tac ttg aaa tgc gta ctg 1165 
Val Glu Glu Lys Asp Phe Glu Lys Leu Thr Tyr Leu Lys Cys Val Leu 
350 355 360 

aag gaa gtc ctt cgc ctc cac cca ccc atc cca ctc ctc ctc cac gag 1213 
Lys Glu Val Leu Arg Leu His Pro Pro Ile Pro Leu Leu Leu His Glu 
365 370 375 380 

act gcc gag gac gcc gag gtc ggc ggc tac tac att ccg gcg aaa tcg 1261 
Thr Ala Glu Asp Ala Glu Val Gly Gly Tyr Tyr Ile Pro Ala Lys Ser 
385 390 395 

cgg gtg atg atc aac gcg tgc gcc atc ggc cgg gac aag aac tcg tgg 1309 
Arg Val Met Ile Asn Ala Cys Ala Ile Gly Arg Asp Lys Asn Ser Trp 
400 405 410 

gcc gac cca gat acg ttt agg ccc tcc agg ttt ctc aaa gac ggt gtg 1357 
Ala Asp Pro Asp Thr Phe Arg Pro Ser Arg Phe Leu Lys Asp Gly Val 
415 420 425 

ccc gat ttc aaa ggg aac aac ttc gag ttc atc cca ttc ggg tca ggt 1405 
Pro Asp Phe Lys Gly Asn Asn Phe Glu Phe Ile Pro Phe Gly Ser Gly 
430 435 440 

cgt cgg tct tgc ccc ggt atg caa ctc gga ctc tac gcg cta gag acg 1453 
Arg Arg Ser Cys Pro Gly Met Gln Leu Gly Leu Tyr Ala Leu Glu Thr 
445 450 455 460 

act gtg gct cac ctc ctt cac tgt ttc acg tgg gag ttg ccg gac ggg 1501 
Thr Val Ala His Leu Leu His Cys Phe Thr Trp Glu Leu Pro Asp Gly 
465 470 475 

atg aaa ccg agt gaa ctc gag atg aat gat gtg ttt gga ctc acc gcg 1549 
Met Lys Pro Ser Glu Leu Glu Met Asn Asp Val Phe Gly Leu Thr Ala 
480 485 490 

cca aga gcg att cga ctc acc gcc gtg ccg agt cca cgc ctt ctc tgt 1597 
Pro Arg Ala Ile Arg Leu Thr Ala Val Pro Ser Pro Arg Leu Leu Cys 
495 500 505 

cct ctc tat tgatcgaatg attgggggag ctttgtggag gggcttttat 1646 

Pro Leu Tyr ... 
510 

ggagactcta tatatagatg ggaagtgaaa caacgacagg tgaatgcttg gatttttggt 1706 

atatattggg gagggagggg aaaaaaaaaa taatgaaagg aaagaaaaga gagaatttga 1766 

atttctcttc ctctgtggat aaaagcctcg tttttaattg tttttatgtg gagatatttg 1826 

tgtttgttta tttttatctc tttttttgca ataacactca aaaataaaaa aaaaaaa 1883 

<210> 4 

<211> 511 

<212> PRT 

<213> Liquidambar styraciflua 

<400> 4 

Met Asp Ser Ser Leu His Glu Ala Leu Gln Pro Leu Pro Met Thr Leu 
1 5 10 15 

Phe Phe Ile Ile Pro Leu Leu Leu Leu Leu Gly Leu Val Ser Arg Leu 
20 25 30 

Arg Gln Arg Leu Pro Tyr Pro Pro Gly Pro Lys Gly Leu Pro Val Ile 
35 40 45 

Gly Asn Met Leu Met Met Asp Gln Leu Thr His Arg Gly Leu Ala Lys 
50 55 60 

Leu Ala Lys Gln Tyr Gly Gly Leu Phe His Leu Lys Met Gly Phe Leu 
65 70 75 80 

His Met Val Ala Val Ser Thr Pro Asp Met Ala Arg Gln Val Leu Gln 
85 90 95 

Val Gln Asp Asn Ile Phe Ser Asn Arg Pro Ala Thr Ile Ala Ile Ser 
100 105 110 

Tyr Leu Thr Tyr Asp Arg Ala Asp Met Ala Phe Ala His Tyr Gly Pro 
115 120 125 

Phe Trp Arg Gln Met Arg Lys Leu Cys Val Met Lys Leu Phe Ser Arg 
130 135 140 

Lys Arg Ala Glu Ser Trp Glu Ser Val Arg Asp Glu Val Asp Ser Ala 
145 150 155 160 

Val Arg Val Val Ala Ser Asn Ile Gly Ser Thr Val Asn Ile Gly Glu 
165 170 175 

Leu Val Phe Ala Leu Thr Lys Asn Ile Thr Tyr Arg Ala Ala Phe Gly 
180 185 190 

Thr Ile Ser His Glu Asp Gln Asp Glu Phe Val Ala Ile Leu Gln Glu 
195 200 205 

Phe Ser Gln Leu Phe Gly Ala Phe Asn Ile Ala Asp Phe Ile Pro Trp 
210 215 220 

Leu Lys Trp Val Pro Gln Gly Ile Asn Val Arg Leu Asn Lys Ala Arg 
225 230 235 240 

Gly Ala Leu Asp Gly Phe Ile Asp Lys Ile Ile Asp Asp His Ile Gln 
245 250 255 

Lys Gly Ser Lys Asn Ser Glu Glu Val Asp Thr Asp Met Val Asp Asp 
260 265 270 

Leu Leu Ala Phe Tyr Gly Glu Glu Ala Lys Val Ser Glu Ser Asp Asp 
275 280 285 

Leu Gln Asn Ser Ile Lys Leu Thr Lys Asp Asn Ile Lys Ala Ile Met 
290 295 300 

Asp Val Met Phe Gly Gly Thr Glu Thr Val Ala Ser Ala Ile Glu Trp 
305 310 315 320 

Ala Met Thr Glu Leu Met Lys Ser Pro Glu Asp Leu Lys Lys Val Gln 
325 330 335 

Gln Glu Leu Ala Val Val Val Gly Leu Asp Arg Arg Val Glu Glu Lys 
340 345 350 

Asp Phe Glu Lys Leu Thr Tyr Leu Lys Cys Val Leu Lys Glu Val Leu 
355 360 365 

Arg Leu His Pro Pro Ile Pro Leu Leu Leu His Glu Thr Ala Glu Asp 
370 375 380 

Ala Glu Val Gly Gly Tyr Tyr Ile Pro Ala Lys Ser Arg Val Met Ile 
385 390 395 400 

Asn Ala Cys Ala Ile Gly Arg Asp Lys Asn Ser Trp Ala Asp Pro Asp 
405 410 415 

Thr Phe Arg Pro Ser Arg Phe Leu Lys Asp Gly Val Pro Asp Phe Lys 
420 425 430 

Gly Asn Asn Phe Glu Phe Ile Pro Phe Gly Ser Gly Arg Arg Ser Cys 
435 440 445 

Pro Gly Met Gln Leu Gly Leu Tyr Ala Leu Glu Thr Thr Val Ala His 
450 455 460 

Leu Leu His Cys Phe Thr Trp Glu Leu Pro Asp Gly Met Lys Pro Ser 
465 470 475 480 

Glu Leu Glu Met Asn Asp Val Phe Gly Leu Thr Ala Pro Arg Ala Ile 
485 490 495 

Arg Leu Thr Ala Val Pro Ser Pro Arg Leu Leu Cys Pro Leu Tyr 
500 505 510 



<210> 5 
<211> 1380 
<212> DNA 

<213> Liquidambar styraciflua 



<220> 

<221> CDS 

<222> (67)..(1170) 



<400> 5 

cggcacgagc cctacctcct ttcttggaaa aatttcccca ttcgatcaca atccgggcct 60 



caaaaa atg gga tca aca agc gaa acg aag atg agc ccg agt gaa gca 108 
Met Gly Ser Thr Ser Glu Thr Lys Met Ser Pro Ser Glu Ala 
1 5 10 

gca gca gca gaa gaa gaa gca ttc gta ttc gct atg caa tta acc agt 156 
Ala Ala Ala Glu Glu Glu Ala Phe Val Phe Ala Met Gln Leu Thr Ser 
15 20 25 30 

gct tca gtt ctt ccc atg gtc cta aaa tca gcc ata gag ctc gac gtc 204 
Ala Ser Val Leu Pro Met Val Leu Lys Ser Ala Ile Glu Leu Asp Val 
35 40 45 

tta gaa atc atg gct aaa gct ggt cca ggt gcg cac ata tcc aca tct 252 
Leu Glu Ile Met Ala Lys Ala Gly Pro Gly Ala His Ile Ser Thr Ser 
50 55 60 

gac ata gcc tct aag ctg ccc aca aag aat cca gat gca gcc gtc atg 300 
Asp Ile Ala Ser Lys Leu Pro Thr Lys Asn Pro Asp Ala Ala Val Met 
65 70 75 

ctt gac cgt atg ctc cgc ctc ttg gct agc tac tct gtt cta acg tgc 348 
Leu Asp Arg Met Leu Arg Leu Leu Ala Ser Tyr Ser Val Leu Thr Cys 
80 85 90 

tct ctc cgc acc ctc cct gac ggc aag atc gag agg ctt tac ggc ctt 396 
Ser Leu Arg Thr Leu Pro Asp Gly Lys Ile Glu Arg Leu Tyr Gly Leu 
95 100 105 110 

gca ccc gtt tgt aaa ttc ttg acc aga aac gat gat gga gtc tcc ata 444 
Ala Pro Val Cys Lys Phe Leu Thr Arg Asn Asp Asp Gly Val Ser Ile 
115 120 125 

gcc gct ctg tct ctc atg aat caa gac aag gtc ctc atg gag agc tgg 492 
Ala Ala Leu Ser Leu Met Asn Gln Asp Lys Val Leu Met Glu Ser Trp 
130 135 140 

tac cac ttg acc gag gca gtt ctt gaa ggt gga att cca ttt aac aag 540 
Tyr His Leu Thr Glu Ala Val Leu Glu Gly Gly Ile Pro Phe Asn Lys 
145 150 155 

gcc tat gga atg aca gca ttt gag tac cat ggc acc gat ccc aga ttc 588 
Ala Tyr Gly Met Thr Ala Phe Glu Tyr His Gly Thr Asp Pro Arg Phe 
160 165 170 

aac aca gtt ttc aac aat gga atg tcc aat cat tcg acc att acc atg 636 
Asn Thr Val Phe Asn Asn Gly Met Ser Asn His Ser Thr Ile Thr Met 
175 180 185 190 

aag aaa atc ctt gag act tac aaa ggg ttc gag gga ctt gga tct gtg 684 
Lys Lys Ile Leu Glu Thr Tyr Lys Gly Phe Glu Gly Leu Gly Ser Val 
195 200 205 

gtt gat gtt ggt ggt ggc act ggt gcc cac ctt aac atg att atc gct 732 
Val Asp Val Gly Gly Gly Thr Gly Ala His Leu Asn Met Ile Ile Ala 
210 215 220 

aaa tac ccc atg atc aag ggc att aac ttc gac ttg cct cat gtt att 780 
Lys Tyr Pro Met Ile Lys Gly Ile Asn Phe Asp Leu Pro His Val Ile 
225 230 235 

gag gag gct ccc tcc tat cct ggt gtg gag cat gtt ggt gga gat atg 828 
Glu Glu Ala Pro Ser Tyr Pro Gly Val Glu His Val Gly Gly Asp Met 
240 245 250 

ttt gtt agt gtt cca aaa gga gat gcc att ttc atg aag tgg ata tgt 876 
Phe Val Ser Val Pro Lys Gly Asp Ala Ile Phe Met Lys Trp Ile Cys 
255 260 265 270 

cat gat tgg agc gat gaa cac tgc ttg aag ttt ttg aag aaa tgt tat 924 
His Asp Trp Ser Asp Glu His Cys Leu Lys Phe Leu Lys Lys Cys Tyr 
275 280 285 

gaa gca ctt cca acc aat ggg aag gtg atc ctt gct gaa tgc atc ctc 972 
Glu Ala Leu Pro Thr Asn Gly Lys Val Ile Leu Ala Glu Cys Ile Leu 
290 295 300 

ccc gtg gcg cca gac gca agc ctc ccc act aag gca gtg gtc cat att 1020 
Pro Val Ala Pro Asp Ala Ser Leu Pro Thr Lys Ala Val Val His Ile 
305 310 315 

gat gtc atc atg ttg gct cat aac cca ggt ggg aaa gag aga act gag 1068 
Asp Val Ile Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu 
320 325 330 

aag gag ttt gag gcc ttg gcc aag ggg gct gga ttt gaa ggt ttc cga 1116 
Lys Glu Phe Glu Ala Leu Ala Lys Gly Ala Gly Phe Glu Gly Phe Arg 
335 340 345 350 

gta gta gcc tcg tgc gct tac aat aca tgg atc atc gaa ttt ttg aag 1164 
Val Val Ala Ser Cys Ala Tyr Asn Thr Trp Ile Ile Glu Phe Leu Lys 
355 360 365 

aag att tgagtcctta ctcggctttg agtacataat accaactcct tttggttttc 1220 
Lys Ile 

gagattgtga ttgtgattgt gattgtctct ctttcgcagt tggccttatg atataatgta 1280 
tcgttaactc gatcacagaa gtgcaaaaga cagtgaatgt acactgcttt ataaaataaa 1340 
aattttaaga ttttgattca tgtaaaaaaa aaaaaaaaaa 1380 



<210> 6 
<211> 368 
<212> PRT 

<213> Liquidambar styraciflua 
<400> 6 

Met Gly Ser Thr Ser Glu Thr Lys Met Ser Pro Ser Glu Ala Ala Ala 
1 5 10 15 

Ala Glu Glu Glu Ala Phe Val Phe Ala Met Gln Leu Thr Ser Ala Ser 
20 25 30 

Val Leu Pro Met Val Leu Lys Ser Ala Ile Glu Leu Asp Val Leu Glu 
35 40 45 

Ile Met Ala Lys Ala Gly Pro Gly Ala His Ile Ser Thr Ser Asp Ile 
50 55 60 

Ala Ser Lys Leu Pro Thr Lys Asn Pro Asp Ala Ala Val Met Leu Asp 
65 70 75 80 

Arg Met Leu Arg Leu Leu Ala Ser Tyr Ser Val Leu Thr Cys Ser Leu 
85 90 95 

Arg Thr Leu Pro Asp Gly Lys Ile Glu Arg Leu Tyr Gly Leu Ala Pro 
100 105 110 

Val Cys Lys Phe Leu Thr Arg Asn Asp Asp Gly Val Ser Ile Ala Ala 
115 120 125 

Leu Ser Leu Met Asn Gln Asp Lys Val Leu Met Glu Ser Trp Tyr His 
130 135 140 

Leu Thr Glu Ala Val Leu Glu Gly Gly Ile Pro Phe Asn Lys Ala Tyr 
145 150 155 160 

Gly Met Thr Ala Phe Glu Tyr His Gly Thr Asp Pro Arg Phe Asn Thr 
165 170 175 

Val Phe Asn Asn Gly Met Ser Asn His Ser Thr Ile Thr Met Lys Lys 
180 185 190 

Ile Leu Glu Thr Tyr Lys Gly Phe Glu Gly Leu Gly Ser Val Val Asp 
195 200 205 

Val Gly Gly Gly Thr Gly Ala His Leu Asn Met Ile Ile Ala Lys Tyr 
210 215 220 

Pro Met Ile Lys Gly Ile Asn Phe Asp Leu Pro His Val Ile Glu Glu 
225 230 235 240 

Ala Pro Ser Tyr Pro Gly Val Glu His Val Gly Gly Asp Met Phe Val 
245 250 255 

Ser Val Pro Lys Gly Asp Ala Ile Phe Met Lys Trp Ile Cys His Asp 
260 265 270 

Trp Ser Asp Glu His Cys Leu Lys Phe Leu Lys Lys Cys Tyr Glu Ala 
275 280 285 

Leu Pro Thr Asn Gly Lys Val Ile Leu Ala Glu Cys Ile Leu Pro Val 
290 295 300 

Ala Pro Asp Ala Ser Leu Pro Thr Lys Ala Val Val His Ile Asp Val 
305 310 315 320 

Ile Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu Lys Glu 
325 330 335 

Phe Glu Ala Leu Ala Lys Gly Ala Gly Phe Glu Gly Phe Arg Val Val 
340 345 350 

Ala Ser Cys Ala Tyr Asn Thr Trp Ile Ile Glu Phe Leu Lys Lys Ile 
355 360 365 



<210> 7 
<211> 2025 
<212> DNA 

<213> Liquidambar styraciflua 
<220> 

<221> CDS 

<222> (60)..(1679) 

<400> 7 

cggcacgagc tcattttcca cttctggttt gatctctgca attcttccat cagtcccta 59 

atg gag acc caa aca aaa caa gaa gaa atc ata tat cgg tcg aaa ctc 107 
Met Glu Thr Gln Thr Lys Gln Glu Glu Ile Ile Tyr Arg Ser Lys Leu 
1 5 10 15 

ccc gat atc tac atc ccc aaa cac ctc cct tta cat tcg tat tgt ttc 155 
Pro Asp Ile Tyr Ile Pro Lys His Leu Pro Leu His Ser Tyr Cys Phe 
20 25 30 

gag aac atc tca cag ttc ggc tcc cgc ccc tgt ctg atc aat ggc gca 203 
Glu Asn Ile Ser Gln Phe Gly Ser Arg Pro Cys Leu Ile Asn Gly Ala 
35 40 45 

acg ggc aag tat tac aca tat gct gag gtt gag ctc att gcg cgc aag 251 
Thr Gly Lys Tyr Tyr Thr Tyr Ala Glu Val Glu Leu Ile Ala Arg Lys 
50 55 60 

gtc gca tcc ggc ctc aac aaa ctc ggc gtt cga caa ggt gac atc atc 299 
Val Ala Ser Gly Leu Asn Lys Leu Gly Val Arg Gln Gly Asp Ile Ile 
65 70 75 80 

atg ctt ttg cta ccc aac tcg ccg gag ttc gtg ttt tca att ctc ggc 347 
Met Leu Leu Leu Pro Asn Ser Pro Glu Phe Val Phe Ser Ile Leu Gly 
85 90 95 

gca tcc tac cgc ggg gct gcc gcc acc gcc gca aac ccg ttt tat acc 395 
Ala Ser Tyr Arg Gly Ala Ala Ala Thr Ala Ala Asn Pro Phe Tyr Thr 
100 105 110 

cct gcc gag atc agg aag caa gcc aaa acc tcc aac gcc agg ctt att 443 
Pro Ala Glu Ile Arg Lys Gln Ala Lys Thr Ser Asn Ala Arg Leu Ile 
115 120 125 

atc aca cat gcc tgt tac tat gag aaa gtg aag gac ttg gtg gaa gag 491 
Ile Thr His Ala Cys Tyr Tyr Glu Lys Val Lys Asp Leu Val Glu Glu 
130 135 140 

aac gtt gcc aag atc ata tgt ata gac tca ccc ccg gac ggt tgt ttg 539 
Asn Val Ala Lys Ile Ile Cys Ile Asp Ser Pro Pro Asp Gly Cys Leu 
145 150 155 160 
145 150 155 160 
cac ttc tcg gag ctg agt gag gcg gac gag aac gac atg ccc aat gta 587 
His Phe Ser Glu Leu Ser Glu Ala Asp Glu Asn Asp Met Pro Asn Val 
165 170 175 

gag att gac ccc gat gat gtg gtg gcg ctg ccg tac tcg tca ggg acg 635 
Glu Ile Asp Pro Asp Asp Val Val Ala Leu Pro Tyr Ser Ser Gly Thr 
180 185 190 

acg ggt tta cca aag ggg gtg atg cta aca cac aag gga caa gtg acg 683 
Thr Gly Leu Pro Lys Gly Val Met Leu Thr His Lys Gly Gln Val Thr 
195 200 205 

agt gtg gcg caa cag gtg gac gga gag aat ccg aac ctg tat ata cat 731 
Ser Val Ala Gln Gln Val Asp Gly Glu Asn Pro Asn Leu Tyr Ile His 
210 215 220 

agc gag gac gtg gtt ctg tgc gtg ttg cct ctg ttt cac atc tac tcg 779 
Ser Glu Asp Val Val Leu Cys Val Leu Pro Leu Phe His Ile Tyr Ser 
225 230 235 240 

atg aac gtc atg ttt tgc ggg tta cga gtt ggt gcg gcg att ctg att 827 
Met Asn Val Met Phe Cys Gly Leu Arg Val Gly Ala Ala Ile Leu Ile 
245 250 255 

atg cag aaa ttt gaa ata tat ggg ttg tta gag ctg gtc aga agt aca 875 
Met Gln Lys Phe Glu Ile Tyr Gly Leu Leu Glu Leu Val Arg Ser Thr 
260 265 270 

ggt gac cat cat gcc tat cgt aca ccc atc gta ttg gca atc tcc aag 923 
Gly Asp His His Ala Tyr Arg Thr Pro Ile Val Leu Ala Ile Ser Lys 
275 280 285 

act ccg gat ctt cac aac tat gat gtg tcc tcc att cgg act gtc atg 971 
Thr Pro Asp Leu His Asn Tyr Asp Val Ser Ser Ile Arg Thr Val Met 
290 295 300 

tca ggt gcg gct cct ctg ggc aag gaa ctt gaa gat tct gtc aga gct 1019 
Ser Gly Ala Ala Pro Leu Gly Lys Glu Leu Glu Asp Ser Val Arg Ala 
305 310 315 320 

aag ttt ccc acc gcc aaa ctt ggt cag gga tat gga atg acg gag gca 1067 
Lys Phe Pro Thr Ala Lys Leu Gly Gln Gly Tyr Gly Met Thr Glu Ala 
325 330 335 

ggg ccc gtg cta gcg atg tgt ttg gca ttt gcc aag gaa ggg ttt gaa 1115 
Gly Pro Val Leu Ala Met Cys Leu Ala Phe Ala Lys Glu Gly Phe Glu 
340 345 350 
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ata aaa teg ggg gca tct gga act gtt tta agq aac gca cag atg aag 1163 
lie Lys Ser Gly Ala Ser Gly Thr Val Leu Arg Asn Ala Gin Met Lys 
355 360 365 

att gtg gac cct gaa acc ggt gtc act ctc cct cga aac caa ccc gga 1211
Ile Val Asp Pro Glu Thr Gly Val Thr Leu Pro Arg Asn Gln Pro Gly
370 375 380

gag att tgc att aga gga gac caa atc atg aaa ggt tat ctt aat gat 1259
Glu Ile Cys Ile Arg Gly Asp Gln Ile Met Lys Gly Tyr Leu Asn Asp
385 390 395 400

cct gag gcg acg gag aga acc ata gac aag gaa ggt tgg tta cac aca 1307
Pro Glu Ala Thr Glu Arg Thr Ile Asp Lys Glu Gly Trp Leu His Thr
405 410 415

ggt gat gtg ggc tac atc gac gat gac act gag ctc ttc att gtt gat 1355
Gly Asp Val Gly Tyr Ile Asp Asp Asp Thr Glu Leu Phe Ile Val Asp
420 425 430

cgg ttg aag gaa ctg atc aaa tac aaa ggg ttt cag gtg gca ccc gct 1403
Arg Leu Lys Glu Leu Ile Lys Tyr Lys Gly Phe Gln Val Ala Pro Ala
435 440 445

gag ctt gag gcc atg ctc ctc aac cat ccc aac atc tct gat gct gcc 1451
Glu Leu Glu Ala Met Leu Leu Asn His Pro Asn Ile Ser Asp Ala Ala
450 455 460

gtc gtc cca atg aaa gac gat gaa gct gga gag ctc cct gtg gcg ttt 1499
Val Val Pro Met Lys Asp Asp Glu Ala Gly Glu Leu Pro Val Ala Phe
465 470 475 480

gtt gta aga tca gat ggt tct cag ata tcc gag gct gaa atc agg caa 1547
Val Val Arg Ser Asp Gly Ser Gln Ile Ser Glu Ala Glu Ile Arg Gln
485 490 495

tac atc gca aaa cag gtg gtt ttt tat aaa aga ata cat cgc gta ttt 1595
Tyr Ile Ala Lys Gln Val Val Phe Tyr Lys Arg Ile His Arg Val Phe
500 505 510

ttc gtc gaa gcc att cct aaa gcg ccc tct ggc aaa atc ttg cgg aag 1643
Phe Val Glu Ala Ile Pro Lys Ala Pro Ser Gly Lys Ile Leu Arg Lys
515 520 525

gac ctg aga gcc aaa ttg gcg tct ggt ctt ccc aat taattctcat 1689
Asp Leu Arg Ala Lys Leu Ala Ser Gly Leu Pro Asn
530 535 540

tcgctaccct cctttctctt atcatacgcc aacacgaacg aagaggctca attaaacgct 1749

gctcattcga agcggctcaa ttaaagctgc tcattcatgt ccaccgagtg ggcagcctgt 1809

cttgttggga tgttctttca tttgattcag ctgtgagaag ccagaccctc attatttatt 1869

gtgaaattca caagaatgtc tgtaaatcga tgttgtgagt gatgggtttc aaaacacttt 1929

tgacattgtt tacgttgtat ttcctgctgt tgaaaataac tactttgtat gacttttatt 1989

tgggaagata acctttcaaa aaaaaaaaaa aaaaaa 2025 



<210> 8 
<211> 540 
<212> PRT 

<213> Liquidambar styraciflua 
<400> 8 

Met Glu Thr Gln Thr Lys Gln Glu Glu Ile Ile Tyr Arg Ser Lys Leu
1 5 10 15

Pro Asp Ile Tyr Ile Pro Lys His Leu Pro Leu His Ser Tyr Cys Phe
20 25 30

Glu Asn Ile Ser Gln Phe Gly Ser Arg Pro Cys Leu Ile Asn Gly Ala
35 40 45

Thr Gly Lys Tyr Tyr Thr Tyr Ala Glu Val Glu Leu Ile Ala Arg Lys
50 55 60

Val Ala Ser Gly Leu Asn Lys Leu Gly Val Arg Gln Gly Asp Ile Ile
65 70 75 80

Met Leu Leu Leu Pro Asn Ser Pro Glu Phe Val Phe Ser Ile Leu Gly
85 90 95

Ala Ser Tyr Arg Gly Ala Ala Ala Thr Ala Ala Asn Pro Phe Tyr Thr
100 105 110

Pro Ala Glu Ile Arg Lys Gln Ala Lys Thr Ser Asn Ala Arg Leu Ile
115 120 125

Ile Thr His Ala Cys Tyr Tyr Glu Lys Val Lys Asp Leu Val Glu Glu
130 135 140

Asn Val Ala Lys Ile Ile Cys Ile Asp Ser Pro Pro Asp Gly Cys Leu
145 150 155 160

His Phe Ser Glu Leu Ser Glu Ala Asp Glu Asn Asp Met Pro Asn Val
165 170 175

Glu Ile Asp Pro Asp Asp Val Val Ala Leu Pro Tyr Ser Ser Gly Thr
180 185 190

Thr Gly Leu Pro Lys Gly Val Met Leu Thr His Lys Gly Gln Val Thr
195 200 205

Ser Val Ala Gln Gln Val Asp Gly Glu Asn Pro Asn Leu Tyr Ile His
210 215 220

Ser Glu Asp Val Val Leu Cys Val Leu Pro Leu Phe His Ile Tyr Ser
225 230 235 240

Met Asn Val Met Phe Cys Gly Leu Arg Val Gly Ala Ala Ile Leu Ile
245 250 255

Met Gln Lys Phe Glu Ile Tyr Gly Leu Leu Glu Leu Val Arg Ser Thr
260 265 270

Gly Asp His His Ala Tyr Arg Thr Pro Ile Val Leu Ala Ile Ser Lys
275 280 285

Thr Pro Asp Leu His Asn Tyr Asp Val Ser Ser Ile Arg Thr Val Met
290 295 300

Ser Gly Ala Ala Pro Leu Gly Lys Glu Leu Glu Asp Ser Val Arg Ala
305 310 315 320

Lys Phe Pro Thr Ala Lys Leu Gly Gln Gly Tyr Gly Met Thr Glu Ala
325 330 335

Gly Pro Val Leu Ala Met Cys Leu Ala Phe Ala Lys Glu Gly Phe Glu
340 345 350

Ile Lys Ser Gly Ala Ser Gly Thr Val Leu Arg Asn Ala Gln Met Lys
355 360 365

Ile Val Asp Pro Glu Thr Gly Val Thr Leu Pro Arg Asn Gln Pro Gly
370 375 380

Glu Ile Cys Ile Arg Gly Asp Gln Ile Met Lys Gly Tyr Leu Asn Asp
385 390 395 400

Pro Glu Ala Thr Glu Arg Thr Ile Asp Lys Glu Gly Trp Leu His Thr
405 410 415

Gly Asp Val Gly Tyr Ile Asp Asp Asp Thr Glu Leu Phe Ile Val Asp
420 425 430

Arg Leu Lys Glu Leu Ile Lys Tyr Lys Gly Phe Gln Val Ala Pro Ala
435 440 445

Glu Leu Glu Ala Met Leu Leu Asn His Pro Asn Ile Ser Asp Ala Ala
450 455 460

Val Val Pro Met Lys Asp Asp Glu Ala Gly Glu Leu Pro Val Ala Phe
465 470 475 480

Val Val Arg Ser Asp Gly Ser Gln Ile Ser Glu Ala Glu Ile Arg Gln
485 490 495

Tyr Ile Ala Lys Gln Val Val Phe Tyr Lys Arg Ile His Arg Val Phe
500 505 510

Phe Val Glu Ala Ile Pro Lys Ala Pro Ser Gly Lys Ile Leu Arg Lys
515 520 525 

Asp Leu Arg Ala Lys Leu Ala Ser Gly Leu Pro Asn 
530 535 540 



<210> 9 

<211> 1544 

<212> DNA 

<213> Pinus taeda 

<400> 9 

aaagataata tatgtgtatg cctactacta cacattgttt tgaagtgtgt aaacatagtg 60 

caacactagg aggactcaca atgagcactt gttgacatga aactagctaa atgcccaaca 120 

atattagtga aagctagtta aactaacccc tttgactttc aagatgatat atttatatcc 180 

ctactacgtc ttcctctttt tgtctttctc ttgtgattaa accttccttg aaacaattct 240 

caaatgtaaa attaaacctt gaaacttgta gagaccaaac ttccctagga gaaaccacat 300 

ttatgacaac atatatacac caacccattg catactataa tattggaatt acctgcagcg 360 

aacgaaagaa acgctgtctc accaactcgt gcactacatc ccgaaactta accttcccct 420

gatacagatt gaagagccga aaaaagcgtg catccaaatt tctggtatgg tgaggagccg 480

aaaaacgcgt gcgcctaatt tttttgagat gggccggaaa ataatgcgtg catctaaatt 540

ttcacgtgtc gcgtattggc gaggttgcgc tgaatgtgat cctgtgcgtg agccacattc 600

attccattgg ttgacccgcc ggtaccgcga ggaccgtggg gtctcacaga tacgcggatg 660

gtggatcagc actgagaaga ttagatgatg accaggcggg catttgaagt aaaaacttgg 720

gggtggttgg caagtacgcg acaaagaggg gtagtgcgca aggaagcgag ttggatgcaa 780

ataatattac aaagtgggtt ggtgggcatg agcatcaacc agaatgatgt tgttgctggt 840

tccgtgcaaa ttctgaccag tagtttgaac aatactaccc aacttgtttt tggtaaaaca 900

tgaagtgggt aaggagaatt gaacttacgt ctcatggtaa agggcaaggg caaatgactt 960

aacacatacc tttaactaat aaaaataccc ctaacaaata cgaaaacgaa tgagttatca 1020

cagaccttca actaataaga tagccatcag acccacatct cctgactgac caaaaacaaa 1080

tgacttcaac caactaagat acccatcaaa gctaacccac aacccaattc ctcacttccc 1140

cttaccagac caaccaagca gacctacgcc attaactact ttaggacgtg ggaattgggg 1200

gtgccaccgt tgaagaatgg cactcagggt tggtaatccc tccacgtgta tgtagcagtc 1260

gtttggtgga gacggcgtgt ttgaatgtcc accttccagt ttggagaaca aggaaattgg 1320

gcttatatta ggcctggatc tcttgtttca gagcaggagt agttcaggac aggaactagc 1380

attcaagaat tcaattgccc tgccctgctc tgctctgctt tgctcaactt attgatccct 1440

gctctggttt gttcaatttc ttgacccctg ctgggttctg ctctggtttg cacactttct 1500

cgattatata agtcattttg gatccttgca aggaagagaa tatg 1544

<210> 10

<211> 659

<212> DNA

<213> Pinus taeda

<400> 10

aaacaccaat ttaatgggat ttcagatttg tatcccatgc tattggctaa ggcatttttc 60

ttattgtaat ctaaccaatt ctaatttcca ccctggtgtg aactgactga caaatgcggt 120

ccgaaaacag cgaatgaaat gtctgggtga tcggtcaaac aagcggtggg cgagagagcg 180

cgggtgttgg cctagccggg atgggggtag gtagacggcg tattaccggc gagttgtccg 240

aatggagttt tcggggtagg tagtaacgta gacgtcaatg gaaaaagtca taatctccgt 300 

caaaaatcca accgctcctt cacatcgcag agttggtggc cacgggaccc tccacccact 360 

cactcaatcg atcgcctgcc gtggttgccc attattcaac catacgccac ttgactcttc 420 

accaacaatt ccaggccggc tttctataca atgtactgca caggaaaatc caatataaaa 480 

agccggcctc tgcttccttc tcagtagccc ccagctcatt caattcttcc cactgcaggc 540 

tacatttgtc agacacgttt tccgccattt ttcgcctgtt tctgcggaga atttgatcag 600 

gttcggattg ggattgaatc aattgaaagg tttttatttt cagtatttcg atcgccatg 659 

<210> 11 

<211> 2251 

<212> DNA 

<213> Pinus taeda 

<400> 11 

ggccgggtgg tgacatttat tcataaattc atctcaaaac aagaaggatt tacaaaaata 60 

aaagaaaaca aaattttcat ctttaacata attataattg tgttcacaaa attcaaactt 120 

aaacccttaa tataaagaat ttctttcaac aatacacttt aatcacaact tcttcaatca 180 

caacctcctc caacaaaatt aaaatagatt aataaataaa taaacttaac tatttaaaaa 240 

aaaatattat acaaaattta ttaaaacttc aaaataaaca aactttttat acaaaattca 300 

tcaaaacttt aaaataaagc taaacactga aaatgtgagt acatttaaaa ggacgctgat 360 

cacaaaaatt ttgaaaacat aaacaaactt gaaactctac cttttaagaa tgagtttgtc 420 

gtctcattaa ctcattagtt ttatagttcg aatccaatta acgtatcttt tattttatgg 480 

aataagggtg ttttaataag tgattttggg atttttttag taatttattt gtgatatgtt 540 

atggagtttt taaaaatata tatatatata tatatttttg ggttgagttt acttaaaatt 600 

tggaaaaggt tggtaagaac tataaattga gttgtgaatg agtgttttat ggatttttta 660 

agatgttaaa tttatatatg taattaaaat tttattttga ataacaaaaa ttataattgg 720

ataaaaaatt gttttgttaa atttagagta aaaatttcaa aatctaaaat aattaaacac 780

tattattttt aaaaaatttg ttggtaaatt ttatcttata tttaagttaa aatttagaaa 840

aaattaattt taaattaata aacttttgaa gtcaaatatt ccaaatattt tccaaaatat 900

taaatctatt ttgcattcaa aatacaattt aaataataaa acttcatgga atagattaac 960

caatttgtat aaaaaccaaa aatctcaaat aaaatttaaa ttacaaaaca ttatcaacat 1020

tatgatttca agaaagacaa taaccagttt ccaataaaat aaaaaacctc atggcccgta 1080

attaagatct cattaattaa ttcttatttt ttaatttttt tacatagaaa atatctttat 1140

attgtatcca agaaatatag aatgttctcg tccagggact attaatctcc aaacaagttt 1200

caaaatcatt acattaaagc tcatcatgtc atttgtggat tggaaattat attgtataag 1260

agaaatatag aatgttctcg tctagggact attaatttcc aaacaaattt caaaatcatt 1320

acattaaagc tcatcatgtc atttgtggat tggaaattag acaaaaaaaa tcccaaatat 1380

ttctctcaat ctcccaaaat atagttcgaa ctccatattt ttggaaattg agaatttttt 1440

tacccaataa tatatttttt tatacatttt agagattttc cagacatatt tgctctggga 1500

tttattggaa tgaaggttga gttataaact ttcagtaatc caagtatctt cggtttttga 1560

agatactaaa tccattatat aataaaaaca cattttaaac accaatttaa tgggatttca 1620

gatttgtatc ccatgctatt ggctaaggca tttttcttat tgtaatctaa ccaattctaa 1680

tttccaccct ggtgtgaact gactgacaaa tgcggtccga aaacagcgaa tgaaatgtct 1740

gggtgatcgg tcaaacaagc ggtgggcgag agagcgcggg tgttggccta gccgggatgg 1800

gggtaggtag acggcgtatt accggcgagt tgtccgaatg gagttttcgg ggtaggtagt 1860

aacgtagacg tcaatggaaa aagtcataat ctccgtcaaa aatccaaccg ctccttcaca 1920

tcgcagagtt ggtggccacg ggaccctcca cccactcact cgatcgcctg ccgtggttgc 1980

ccattattca accatacgcc acttgactct tcaccaacaa ttccaggccg gctttctata 2040

caatgtactg cacaggaaaa tccaatataa aaagccggcc tctgcttcct tctcagtagc 2100

ccccagctca ttcaattctt cccactgcag gctacatttg tcagacacgt tttccgccat 2160

ttttcgcctg tttctgcgga gaatttgatc aggttcggat tgggattgaa tcaattgaaa 2220

ggtttttatt ttcagtattt cgatcgccat g 2251